For as the heavens are higher than the earth, so are my ways higher than 
your ways, and my thoughts than your thoughts, says the Lord. 

Isaiah 55:9 



Preface 


Ideas from quantum physics play important roles in many parts of modern 
mathematics. Many parts of representation theory, for example, are motivated 
by quantum mechanics, including the Wigner-Mackey theory of induced 
representations, the Kirillov-Kostant orbit method, and, of course, 
quantum groups. The Jones polynomial in knot theory, the Gromov-Witten 
invariants in topology, and mirror symmetry in algebraic topology are other 
notable examples. The awarding of the 1990 Fields Medal to Ed Witten, a 
physicist, gives an idea of the scope of the influence of quantum theory in 
mathematics. 

Despite the importance of quantum mechanics to mathematics, there is 
no easy way for mathematicians to learn the subject. Quantum mechanics 
books in the physics literature are generally not easily understood by 
most mathematicians. There is, of course, a lower level of mathematical 
precision in such books than mathematicians are accustomed to. In addition, 
physics books on quantum mechanics assume knowledge of classical 
mechanics that mathematicians often do not have. And, finally, there is a 
subtle difference in “culture”—differences in terminology and notation— 
that can make reading the physics literature like reading a foreign language 
for the mathematician. There are few books that attempt to translate quantum 
theory into terms that mathematicians can understand. 

This book is intended as an introduction to quantum mechanics for mathematicians 
with little prior exposure to physics. The twin goals of the book 
are (1) to explain the physical ideas of quantum mechanics in language 
mathematicians will be comfortable with, and (2) to develop the necessary 
mathematical tools to treat those ideas in a rigorous fashion. I have 
attempted to give a reasonably comprehensive treatment of nonrelativistic 
quantum mechanics, including topics found in typical physics texts (e.g., 
the harmonic oscillator, the hydrogen atom, and the WKB approximation) 
as well as more mathematical topics (e.g., quantization schemes, the Stone-von 
Neumann theorem, and geometric quantization). I have also attempted 
to minimize the mathematical prerequisites. I do not assume, for example, 
any prior knowledge of spectral theory or unbounded operators, but provide 
a full treatment of those topics in Chaps. 6 through 10 of the text. 
Similarly, I do not assume familiarity with the theory of Lie groups and 
Lie algebras, but provide a detailed account of those topics in Chap. 16. 
Whenever possible, I provide full proofs of the stated results. 

Most of the text will be accessible to graduate students in mathematics 
who have had a first course in real analysis, covering the basics of $L^2$ spaces 
and Hilbert spaces. Appendix A reviews some of the results that are used in 
the main body of the text. In Chaps. 21 and 23, however, I assume knowledge 
of the theory of manifolds. I have attempted to provide motivation for 
many of the definitions and proofs in the text, with the result that there 
is a fair amount of discussion interspersed with the standard definition-theorem-proof 
style of mathematical exposition. There are exercises at the 
end of each chapter, making the book suitable for graduate courses as well 
as for independent study. 

In comparison to the present work, classics such as Reed and Simon [34] 
and Glimm and Jaffe [14], along with the recent book of Schmüdgen [35], 
are more focused on the mathematical underpinnings of the theory than 
on the physical ideas. Hannabuss’s text [22] is fairly accessible to mathematicians, 
but, despite the word “graduate” in the title of the series, 
uses an undergraduate level of mathematics. The recent book of Takhtajan 
[39], meanwhile, has an expository bent to it, but provides less physical 
motivation and is less self-contained than the present book. Whereas, for 
example, Takhtajan begins with Lagrangian and Hamiltonian mechanics 
on manifolds, I begin with “low-tech” classical mechanics on the real line. 
Similarly, Takhtajan assumes knowledge of unbounded operators and Lie 
groups, while I provide substantial expositions of both of those subjects. 
Finally, there is the work of Folland [13], which I highly recommend, but 
which deals with quantum field theory, whereas the present book treats 
only nonrelativistic quantum mechanics, except for a very brief discussion 
of quantum field theory in Sect. 20.6. 

The book begins with a quick introduction to the main ideas of classical 
and quantum mechanics. After a brief account in Chap. 1 of the historical 
origins of quantum theory, I turn in Chap. 2 to a discussion of the necessary 
background from classical mechanics. This includes Newton’s equation 
in varying degrees of generality, along with a discussion of important 
physical quantities such as energy, momentum, and angular momentum, 
and conditions under which these quantities are “conserved” (i.e., constant 
along each solution of Newton’s equation). I give a short treatment here 
of Poisson brackets and Hamilton’s form of Newton’s equation, deferring a 
full discussion of “fancy” classical mechanics to Chap. 21. 

In Chap. 3, I attempt to motivate the structures of quantum mechanics in 
the simplest setting. Although I discuss the “axioms” (in standard physics 
terminology) of quantum mechanics, I resolutely avoid a strictly axiomatic 
approach to the subject (using, say, C*-algebras). Rather, I try to provide 
some motivation for the position and momentum operators and the Hilbert 
space approach to quantum theory, as they connect to the probabilistic aspect 
of the theory. I do not attempt to explain the strange probabilistic 
nature of quantum theory, if, indeed, there is any explanation of it. Rather, 
I try to elucidate how the wave function, along with the position and momentum 
operators, encodes the relevant probabilities. 

In Chaps. 4 and 5, we look into two illustrative cases of the Schrödinger 
equation in one space dimension: a free particle and a particle in a square 
well. In these chapters, we encounter such important concepts as the distinction 
between phase velocity and group velocity and the distinction between 
a discrete and a continuous spectrum. 

In Chaps. 6 through 10, we look into some of the technical mathematical 
issues that are swept under the carpet in earlier chapters. I have tried to 
design this section of the book in such a way that a reader can take in as 
much or as little of the mathematical details as desired. For a reader who 
simply wants the big picture, I outline the main ideas and results of spectral 
theory in Chap. 6, including a discussion of the prototypical example 
of an operator with a continuous spectrum: the momentum operator. For 
a reader who wants more information, I provide statements of the spectral 
theorem (in two different forms) for bounded self-adjoint operators in 
Chap. 7, and an introduction to the notion of unbounded self-adjoint operators 
in Chap. 9. Finally, for the reader who wants all the details, I give 
proofs of the spectral theorem for bounded and unbounded self-adjoint 
operators, in Chaps. 8 and 10, respectively. 

In Chaps. 11 through 14, we turn to the vitally important canonical commutation 
relations. These are used in Chap. 11 to derive algebraically the 
spectrum of the quantum harmonic oscillator. In Chap. 12, we discuss the 
uncertainty principle, both in its general form (for arbitrary pairs of noncommuting 
operators) and in its specific form (for the position and momentum 
operators). We pay careful attention to subtle domain issues that are 
usually glossed over in the physics literature. In Chap. 13, we look at different 
“quantization schemes” (i.e., different ways of ordering products of the 
noncommuting position and momentum operators). In Chap. 14, we turn to 
the celebrated Stone-von Neumann theorem, which provides a uniqueness 
result for representations of the canonical commutation relations. As in the 
case of the uncertainty principle, there are some subtle domain issues here 
that require attention. 

In Chaps. 15 through 18, we examine some less elementary issues in quantum 
theory. Chapter 15 addresses the WKB (Wentzel-Kramers-Brillouin) 
approximation, which gives simple but approximate formulas for the eigenvectors 
and eigenvalues for the Hamiltonian operator in one dimension. 
After this, we introduce (Chap. 16) the notion of Lie groups, Lie algebras, 
and their representations, all of which play an important role in 
many parts of quantum mechanics. In Chap. 17, we consider the example 
of angular momentum and spin, which can be understood in terms of the 
representations of the rotation group SO(3). Here a more mathematical 
approach—especially the relationship between Lie group representations 
and Lie algebra representations—can substantially clarify a topic that is 
rather mysterious in the physics literature. In particular, the concept of 
“fractional spin” can be understood as describing a representation of the 
Lie algebra of the rotation group for which there is no associated representation 
of the rotation group itself. In Chap. 18, we illustrate these ideas by 
describing the energy levels of the hydrogen atom, including a discussion 
of the hidden symmetries of hydrogen, which account for the “accidental 
degeneracy” in the levels. In Chap. 19, we look more closely at the concept 
of the “state” of a system in quantum mechanics. We look at the notion 
of subsystems of a quantum system in terms of tensor products of Hilbert 
spaces, and we see in this setting that the notion of “pure state” (a unit 
vector in the relevant Hilbert space) is not adequate. We are led, then, to 
the notion of a mixed state (or density matrix). We also examine the idea 
that, in quantum mechanics, “identical particles are indistinguishable.” 

Finally, in Chaps. 20 through 23, we examine some advanced topics in 
classical and quantum mechanics. We begin, in Chap. 20, by considering the 
path integral formulation of quantum mechanics, both from the heuristic 
perspective of the Feynman path integral, and from the rigorous perspective 
of the Feynman-Kac formula. Then, in Chap. 21, we give a brief treatment 
of Hamiltonian mechanics on manifolds. Finally, we consider the machinery 
of geometric quantization, beginning with the Euclidean case in Chap. 22 
and continuing with the general case in Chap. 23. 

I am grateful to all who have offered suggestions or made corrections 
to the manuscript, including Renato Bettiol, Edward Burkard, Matt Cecil, 
Tiancong Chen, Bo Jacoby, Will Kirwin, Nicole Kroeger, Wicharn Lewkeeratiyutkul, 
Jeff Mitchell, Eleanor Pettus, Ambar Sengupta, and Augusto 
Stoffel. I am particularly grateful to Michel Talagrand who read almost 
the entire manuscript and made numerous corrections and suggestions. Finally, 
I offer a special word of thanks to my advisor and friend, Leonard 
Gross, who started me on the path toward understanding the mathematical 
foundations of quantum mechanics. Readers are encouraged to send me 
comments or corrections at bhall@nd.edu. 


Notre Dame, IN, USA 


Brian C. Hall 
1 

The Experimental Origins of Quantum 
Mechanics 


Quantum mechanics, with its controversial probabilistic nature and curious 
blending of waves and particles, is a very strange theory. It was not 
invented because anyone thought this is the way the world should behave, 
but because various experiments showed that this is the way the world 
does behave, like it or not. Craig Hogan, director of the Fermilab Particle 
Astrophysics Center, put it this way: 

No theorist in his right mind would have invented quantum 
mechanics unless forced to by data. 1 

Although the first hint of quantum mechanics came in 1900 with Planck’s 
solution to the problem of blackbody radiation, the full theory did not 
emerge until 1925-1926, with Heisenberg’s matrix model, Schrödinger’s 
wave model, and Born’s statistical interpretation of the wave model. 


1.1 Is Light a Wave or a Particle? 

1.1.1 Newton Versus Huygens 

Beginning in the late seventeenth century and continuing into the early 
eighteenth century, there was a vigorous debate in the scientific community 
over the nature of light. One camp, following the views of Isaac 
Newton, claimed that light consisted of a group of particles or “corpuscles.” 
The other camp, led by the Dutch physicist Christiaan Huygens, 
claimed that light was a wave. Newton argued that only a corpuscular theory 
could account for the observed tendency of light to travel in straight 
lines. Huygens and others, on the other hand, argued that a wave theory 
could explain numerous observed aspects of light, including the bending 
or “refraction” of light as it passes from one medium to another, as from 
air into water. Newton’s reputation was such that his “corpuscular” theory 
remained the dominant one until the early nineteenth century. 

1 Quoted in “Is Space Digital?” by Michael Moyer, Scientific American, February 
2012, pp. 30-36. 


1.1.2 The Ascendance of the Wave Theory of Light 

In 1804, Thomas Young published two papers describing and explaining 
his double-slit experiment. In this experiment, sunlight passes through a 
small hole in a piece of cardboard and strikes another piece of cardboard 
containing two small holes. The light then strikes a third piece of cardboard, 
where the pattern of light may be observed. Young observed “fringes” or 
alternating regions of high and low intensity for the light. Young believed 
that light was a wave and he postulated that these fringes were the result 
of interference between the waves emanating from the two holes. Young 
drew an analogy between light and water, where in the case of water, 
interference is readily observed. If two circular waves of water cross each 
other, there will be some points where a peak of one wave matches up with 
a trough of another wave, resulting in destructive interference, that is, a 
partial cancellation between the two waves, resulting in a small amplitude 
of the combined wave at that point. At other points, on the other hand, a 
peak in one wave will line up with a peak in the other, or a trough with 
a trough. At such points, there is constructive interference, with the result 
that the amplitude of the combined wave is large at that point. The pattern 
of constructive and destructive interference will produce something like a 
checkerboard pattern of alternating regions of large and small amplitudes 
in the combined wave. The dimensions of each region will be roughly on 
the order of the wavelength of the individual waves. 

Based on this analogy with water waves, Young was able to explain the 
interference fringes that he observed and to predict the wavelength that 
light must have in order for the specific patterns he observed to occur. 
Based on his observations, Young claimed that the wavelength of visible 
light ranged from about 1/36,000 in. (about 700 nm) at the red end of the 
spectrum to about 1/60,000 in. (about 425 nm) at the violet end of the 
spectrum, results that agree with modern measurements. 

Figure 1.1 shows how circular waves emitted from two different points 
form an interference pattern. One should think of Young’s second piece of 
cardboard as being at the top of the figure, with holes near the top left and 
top right of the figure. Figure 1.2 then plots the intensity (i.e., the square of 
the displacement) as a function of x, with y having the value corresponding 
to the bottom of Fig. 1.1. 

FIGURE 1.1. Interference of waves emitted from two slits. 
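
As a rough numerical illustration of the two-source interference described above, the following Python sketch superposes two circular waves and evaluates the resulting intensity along a horizontal line. The source separation, the wavelength, the height of the line, and the neglect of amplitude decay with distance are all assumptions chosen for illustration; they are not taken from the figures.

```python
import math

def two_source_intensity(x, y, separation=4.0, wavelength=1.0):
    """Squared magnitude of the sum of two unit-amplitude circular waves.

    The sources sit at (-separation/2, 0) and (+separation/2, 0) and are
    assumed to oscillate in phase. Amplitude decay with distance is ignored.
    """
    k = 2.0 * math.pi / wavelength            # wave number
    r1 = math.hypot(x + separation / 2.0, y)  # distance to the left source
    r2 = math.hypot(x - separation / 2.0, y)  # distance to the right source
    # |exp(i k r1) + exp(i k r2)|^2 = 2 + 2 cos(k (r1 - r2))
    return 2.0 + 2.0 * math.cos(k * (r1 - r2))

def intensity_profile(y=20.0, x_min=-10.0, x_max=10.0, samples=201):
    """Sample the intensity along the horizontal line at height y."""
    step = (x_max - x_min) / (samples - 1)
    xs = [x_min + i * step for i in range(samples)]
    return xs, [two_source_intensity(x, y) for x in xs]
```

On the line of symmetry x = 0 the two path lengths agree, so the interference is fully constructive there; away from it the intensity oscillates between bright and dark fringes as the path difference passes through whole and half multiples of the wavelength.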

Despite the convincing nature of Young’s experiment, many proponents 
of the corpuscular theory of light remained unconvinced. In 1818, the 
French Academy of Sciences set up a competition for papers explaining 
the observed properties of light. One of the submissions was a paper by 
Augustin-Jean Fresnel in which he elaborated on Huygens’s wave model 
of refraction. A supporter of the corpuscular theory of light, Siméon-Denis 
Poisson, read Fresnel’s submission and ridiculed it by pointing out that 
if that theory were true, light passing by an opaque disk would diffract 
around the edges of the disk to produce a bright spot in the center of the 
shadow of the disk, a prediction that Poisson considered absurd. Nevertheless, 
the head of the judging committee for the competition, François 
Arago, decided to put the issue to an experimental test and found that 
such a spot does in fact occur. Although this spot is often called “Arago’s 
spot,” or even, ironically, “Poisson’s spot,” Arago eventually realized that 
the spot had been observed 100 years earlier in separate experiments by 
Delisle and Maraldi. 

Arago’s observation of Poisson’s spot led to widespread acceptance of 
the wave theory of light. This theory gained even greater acceptance in 
1865, when James Clerk Maxwell put together what are today known as 
Maxwell’s equations. Maxwell showed that his equations predicted that 
electromagnetic waves would propagate at a certain speed, which agreed 
with the observed speed of light. Maxwell thus concluded that light is simply 
an electromagnetic wave. From 1865 until the end of the nineteenth 
century, the debate over the wave-versus-particle nature of light was considered 
to have been conclusively settled in favor of the wave theory. 

FIGURE 1.2. Intensity plot for a horizontal line across the bottom of Fig. 1.1 

1.1.3 Blackbody Radiation 

In the early twentieth century, the wave theory of light began to experience 
new challenges. The first challenge came from the theory of blackbody radiation. 
In physics, a blackbody is an idealized object that perfectly absorbs all 
electromagnetic radiation that hits it. A blackbody can be approximated in 
the real world by an object with a highly absorbent surface such as “lamp 
black.” The problem of blackbody radiation concerns the distribution of 
electromagnetic radiation in a cavity within a blackbody. Although the 
walls of the blackbody absorb the radiation that hits them, thermal vibrations 
of the atoms making up the walls cause the blackbody to emit electromagnetic 
radiation. (At normal temperatures, most of the radiation emitted 
would be in the infrared range.) 

In the cavity, then, electromagnetic radiation is constantly absorbed and 
re-emitted until thermal equilibrium is reached, at which point the absorption 
and emission of radiation are perfectly balanced at each frequency. 
According to the “equipartition theorem” of (classical) statistical mechanics, 
the energy in any given mode of electromagnetic radiation should be 
exponentially distributed, with an average value equal to $k_B T$, where T is 
the temperature and $k_B$ is Boltzmann’s constant. (The temperature should 
be measured on a scale where absolute zero corresponds to T = 0.) The difficulty 
with this prediction is that the average amount of energy is the same 
for every mode (hence the term “equipartition”). Thus, once one adds up 
over all modes, of which there are infinitely many, the predicted amount 
of energy in the cavity is infinite. This strange prediction is referred to as 
the ultraviolet catastrophe, since the infinitude of the energy comes from the 
ultraviolet (high-frequency) end of the spectrum. This ultraviolet catastrophe 
does not seem to make physical sense and certainly does not match up 
with the observed energy spectrum within real-world blackbodies. 
An alternative prediction of the blackbody energy spectrum was offered 
by Max Planck in a paper published in 1900. Planck postulated that 
the energy in the electromagnetic field at a given frequency $\omega$ should be 
“quantized,” meaning that this energy should come only in integer multiples 
of a certain basic unit equal to $\hbar\omega$, where $\hbar$ is a constant, which 
we now call Planck’s constant. Planck postulated that the energy would 
again be exponentially distributed, but only over integer multiples of $\hbar\omega$. 
At low frequencies, Planck’s theory predicts essentially the same energy as 
in classical statistical mechanics. At high frequencies, namely at frequencies 
where $\hbar\omega$ is large compared to $k_B T$, Planck’s theory predicts a rapid 
fall-off of the average energy (see Exercise 2 for details). Indeed, if we measure 
mass, distance, and time in units of grams, centimeters, and seconds, 
respectively, and we assign $\hbar$ the numerical value 

$$\hbar = 1.054 \times 10^{-27},$$

then Planck’s predictions match the experimentally observed blackbody 
spectrum. 
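
To make the low- and high-frequency behavior concrete, here is a small Python sketch of the average energy under Planck's postulate. It assumes the energy takes only the values $n\hbar\omega$ for n = 0, 1, 2, ... with probability proportional to $e^{-n\hbar\omega/(k_B T)}$; summing the resulting geometric series gives the closed form $\hbar\omega/(e^{\hbar\omega/(k_B T)} - 1)$, which the code compares with a direct summation. The numerical value of Boltzmann's constant and the function names are assumptions introduced for this illustration.

```python
import math

HBAR = 1.054e-27  # erg*s, the value quoted in the text
K_B = 1.381e-16   # erg/K, Boltzmann's constant in the same units (assumed value)

def planck_average_energy(omega, temperature, terms=10000):
    """Average energy by direct summation over the levels n * hbar * omega."""
    x = HBAR * omega / (K_B * temperature)
    weights = [math.exp(-n * x) for n in range(terms)]
    total = sum(weights)
    mean_n = sum(n * w for n, w in enumerate(weights)) / total
    return mean_n * HBAR * omega

def planck_closed_form(omega, temperature):
    """Closed form of the same average: hbar*omega / (exp(x) - 1)."""
    x = HBAR * omega / (K_B * temperature)
    return HBAR * omega / math.expm1(x)
```

When $\hbar\omega$ is small compared to $k_B T$, both functions return a value close to the classical $k_B T$; when $\hbar\omega$ is large compared to $k_B T$, the average is exponentially suppressed, which is what removes the divergence at the ultraviolet end.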

Planck pictured the walls of the blackbody as being made up of independent 
oscillators of different frequencies, each of which is restricted to 
have energies that are integer multiples of $\hbar\omega$. Although this picture 
was clearly not intended as a 
realistic physical explanation of the quantization of electromagnetic energy 
in blackbodies, it does suggest that Planck thought that energy quantization 
arose from properties of the walls of the cavity, rather than from intrinsic 
properties of the electromagnetic radiation. Einstein, on the other hand, in 
assessing Planck’s model, argued that energy quantization was inherent in 
the radiation itself. In Einstein’s picture, then, electromagnetic energy at 
a given frequency (whether in a blackbody cavity or not) comes in packets 
or quanta having energy proportional to the frequency. Each quantum 
of electromagnetic energy constitutes what we now call a photon, which 
we may think of as a particle of light. Thus, Planck’s model of blackbody 
radiation began a rebirth of the particle theory of light. 

It is worth mentioning, in passing, that in 1900, the same year in which 
Planck’s paper on blackbody radiation appeared, Lord Kelvin gave a lecture 
that drew attention to another difficulty with the classical theory 
of statistical mechanics. Kelvin described two “clouds” over nineteenth-century 
physics at the dawn of the twentieth century. The first of these 
clouds concerned aether (a hypothetical medium through which electromagnetic 
radiation propagates) and the failure of Michelson and Morley to 
observe the motion of earth relative to the aether. Under this cloud lurked 
the theory of special relativity. The second of Kelvin’s clouds concerned 
heat capacities in gases. The equipartition theorem of classical statistical 
mechanics made predictions for the ratio of the heat capacity at constant 
pressure ($c_p$) to the heat capacity at constant volume ($c_v$). These predictions 
deviated substantially from the experimentally measured ratios. 
Under the second cloud lurked the theory of quantum mechanics, because 
the resolution of this discrepancy is similar to Planck’s resolution of the 
blackbody problem. As in the case of blackbody radiation, quantum mechanics 
gives rise to a correction to the equipartition theorem, thus resulting 
in different predictions for the ratio of $c_p$ to $c_v$, predictions that can be 
reconciled with the observed ratios. 

1.1.4 The Photoelectric Effect 

The year 1905 was Einstein’s annus mirabilis (miraculous year), in which 
Einstein published four ground-breaking papers, two on the special theory 
of relativity and one each on Brownian motion and the photoelectric effect. 
It was for the photoelectric effect that Einstein won the Nobel Prize in 
physics in 1921. In the photoelectric effect, electromagnetic radiation striking 
a metal causes electrons to be emitted from the metal. Einstein found 
that as one increases the intensity of the incident light, the number of emitted 
electrons increases, but the energy of each electron does not change. 
This result is difficult to explain from the perspective of the wave theory of 
light. After all, if light is simply an electromagnetic wave, then increasing 
the intensity of the light amounts to increasing the strength of the electric 
and magnetic fields involved. Increasing the strength of the fields, in turn, 
ought to increase the amount of energy transferred to the electrons. 

Einstein’s results, on the other hand, are readily explained from a particle 
theory of light. Suppose light is actually a stream of particles (photons) with 
the energy of each particle determined by its frequency. Then increasing 
the intensity of light at a given frequency simply increases the number of 
photons and does not affect the energy of each photon. If each photon has 
a certain likelihood of hitting an electron and causing it to escape from 
the metal, then the energy of the escaping electron will be determined 
by the frequency of the incident light and not by the intensity of that 
light. The photoelectric effect, then, provided another compelling reason 
for believing that light can behave in a particlelike manner. 
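
The particle picture can be sketched in a few lines of Python. The sketch assumes that each photon at angular frequency $\omega$ carries energy $\hbar\omega$ and that an escaping electron keeps whatever is left after paying a fixed escape cost. That cost, called `work_function` here, is a hypothetical material constant introduced only for this illustration and is not discussed in the text; the function names are likewise assumptions.

```python
HBAR = 1.054e-27  # erg*s, the value quoted earlier in the chapter

def photon_energy(omega):
    """Energy of a single photon at angular frequency omega."""
    return HBAR * omega

def photons_per_second(power, omega):
    """Number of photons per second delivered by a beam of the given power (erg/s)."""
    return power / photon_energy(omega)

def max_electron_energy(omega, work_function):
    """Energy available to an escaping electron, or zero if the photon cannot free it.

    work_function is a hypothetical escape cost, not a quantity from the text.
    """
    return max(photon_energy(omega) - work_function, 0.0)
```

Doubling `power` doubles `photons_per_second`, and hence the expected number of emitted electrons, while `max_electron_energy` does not depend on the power at all. This mirrors the observation that intensity controls the number of electrons but not their individual energies.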

1.1.5 The Double-Slit Experiment, Revisited 

Although the work of Planck and Einstein suggests that there is a particlelike 
aspect to light, there is certainly also a wavelike aspect to light, 
as shown by Young, Arago, and Maxwell, among others. Thus, somehow, 
light must in some situations behave like a wave and in some situations 
like a particle, a phenomenon known as “wave-particle duality.” William 
Lawrence Bragg described the situation thus: 

God runs electromagnetics on Monday, Wednesday, and Friday 
by the wave theory, and the devil runs them by quantum theory 
on Tuesday, Thursday, and Saturday. 

(Apparently Sunday, being a day of rest, did not need to be accounted for.) 
In particular, we have already seen that Young’s double-slit experiment 
in the early nineteenth century was one important piece of evidence in favor 
of the wave theory of light. If light is really made up of particles, as 
blackbody radiation and the photoelectric effect suggest, one must give a 
particle-based explanation of the double-slit experiment. J. J. Thomson suggested 
in 1907 that the patterns of light seen in the double-slit experiment 
could be the result of different photons somehow interfering with one another. 
Thomson thus suggested that if the intensity of light were sufficiently 
reduced, the photons in the light would become widely separated and the 
interference pattern might disappear. In 1909, Geoffrey Ingram Taylor set 
out to test this suggestion and found that even when the intensity of light 
was drastically reduced (to the point that it took three months for one of 
the images to form), the interference pattern remained the same. 

Since Taylor’s results suggest that interference remains even when the 
photons are widely separated, the photons are not interfering with one another. 
Rather, as Paul Dirac put it in Chap. 1 of [6], “Each photon then 
interferes only with itself.” To state this in a different way, since there is no 
interference when there is only one slit, Taylor’s results suggest that each 
individual photon passes through both slits. By the early 1960s, it became 
possible to perform double-slit experiments with electrons instead of photons, 
yielding even more dramatic confirmations of the strange behavior of 
matter in the quantum realm. (See Sect. 1.2.4.) 


1.2 Is an Electron a Wave or a Particle? 

In the early part of the twentieth century, the atomic theory of matter 
became firmly established. (Einstein’s 1905 paper on Brownian motion was 
an important confirmation of the theory and provided the first calculation 
of atomic masses in everyday units.) Experiments performed in 1909 by 
Hans Geiger and Ernest Marsden, under the direction of Ernest Rutherford, 
led Rutherford to put forward in 1911 a picture of atoms in which a small 
nucleus contains most of the mass of the atom. In Rutherford’s model, 
each atom has a positively charged nucleus with charge nq, where n is 
a positive integer (the atomic number) and q is the basic unit of charge 
first observed in Millikan’s famous oil-drop experiment. Surrounding the 
nucleus is a cloud of n electrons, each having negative charge $-q$. When 
atoms bind into molecules, some of the electrons of one atom may be shared 
with another atom to form a bond between the atoms. This picture of atoms 
and their binding led to the modern theory of chemistry. 

Basic to the atomic theory is that electrons are particles; indeed, the 
number of electrons per atom is supposed to be the atomic number. Nevertheless, 
it did not take long after the atomic theory of matter was confirmed 
before wavelike properties of electrons began to be observed. The situation, 
then, is the reverse of that with light. While light was long thought to be 
a wave (at least from the publication of Maxwell’s equations in 1865 until 
Planck’s work in 1900) and was only later seen to have particlelike behavior, 
electrons were initially thought to be particles and were only later seen to 
have wavelike properties. In the end, however, both light and electrons have 
both wavelike and particlelike properties. 


1.2.1 The Spectrum of Hydrogen 

If electricity is passed through a tube containing hydrogen gas, the gas will 
emit light. If that light is separated into different frequencies by means 
of a prism, bands will become apparent, indicating that the light is not a 
continuous mix of many different frequencies, but rather consists only of a 
discrete family of frequencies. In view of the photonic theory of light, the 
energy in each photon is proportional to its frequency. Thus, each observed 
frequency corresponds to a certain amount of energy being transferred from 
a hydrogen atom to the electromagnetic field. 

Now, a hydrogen atom consists of a single proton surrounded by a single 
electron. Since the proton is much more massive than the electron, one 
can picture the proton as being stationary, with the electron orbiting it. 
The idea, then, is that the current being passed through the gas causes some 
of the electrons to move to a higher-energy state. Eventually, such an electron 
will return to a lower-energy state, emitting a photon in the process. In this 
way, by observing the energies (or, equivalently, the frequencies) of the 
emitted photons, one can work backwards to the change in energy of the 
electron. 

The curious thing about the state of affairs in the preceding paragraph 
is that the energies of the emitted photons—and hence, also, the energies 
of the electron—come only in a discrete family of possible values. Based 
on the observed frequencies, Johannes Rydberg concluded in 1888 that the 
possible energies of the electron were of the form 

$$E_n = -\frac{R}{n^2}. \tag{1.1}$$

Here, R is the “Rydberg constant,” given (in “Gaussian units”) by 

$$R = \frac{m_e Q^4}{2\hbar^2},$$

where Q is the charge of the electron and $m_e$ is the mass of the electron. 
(Technically, $m_e$ should be replaced by the reduced mass $\mu$ of the proton-electron 
system; that is, $\mu = m_e m_p/(m_e + m_p)$, where $m_p$ is the mass 
of the proton. However, since the proton mass is much greater than the 
electron mass, $\mu$ is almost the same as $m_e$ and we will neglect the difference 
between the two.) The energies in (1.1) agree with experiment, in that all 
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the observed frequencies in hydrogen are (at least to the precision available 
at the time of Rydberg) of the form 


$$\omega = \frac{1}{\hbar}(E_n - E_m) \tag{1.2}$$


for some n > m. It should be noted that Johann Balmer had already 
observed in 1885 frequencies of the same form, but only in the case m = 2, 
and that Balmer’s work influenced Rydberg. 
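
The relation between (1.1) and (1.2) can be illustrated numerically. The following is a minimal sketch, not part of the original text: the numerical values of the Rydberg energy (about 13.6057 eV) and of $hc$ (about 1239.84 eV nm) are assumed, and the function names are hypothetical. It converts each energy difference $E_n - E_m$ into a photon wavelength for Balmer's case $m = 2$.

```python
# Sketch: Rydberg energies E_n = -R / n^2 and the photon from a jump n -> m.
# Assumed numerical values (not taken from the text): R ~ 13.6057 eV and
# h*c ~ 1239.84 eV*nm, so wavelengths come out in nanometres.
RYDBERG_EV = 13.6057
HC_EV_NM = 1239.84


def level_energy(n):
    """Energy of level n from (1.1), in electron volts."""
    return -RYDBERG_EV / n**2


def photon_energy(n, m):
    """Energy released when the electron drops from level n to level m < n."""
    if n <= m:
        raise ValueError("emission requires n > m")
    return level_energy(n) - level_energy(m)


def wavelength_nm(n, m):
    """Wavelength of the emitted photon, using lambda = h*c / energy."""
    return HC_EV_NM / photon_energy(n, m)


# Balmer's case m = 2: the first four lines.
balmer_lines = [wavelength_nm(n, 2) for n in range(3, 7)]
for n, lam in zip(range(3, 7), balmer_lines):
    print(f"n = {n} -> m = 2: {lam:.1f} nm")
```

Under these assumed constants the four wavelengths land close to the familiar visible Balmer lines (near 656, 486, 434, and 410 nm), which is the discrete family of frequencies described above.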

The frequencies in (1.2) are known as the spectrum of hydrogen. Balmer 
and Rydberg were merely attempting to find a simple formula that would 
match the observed frequencies in hydrogen. Neither of them had a theoretical
explanation for why only these particular frequencies occur. Such
an explanation would have to wait until the beginnings of quantum theory 
in the twentieth century. 

1.2.2 The Bohr-de Broglie Model of the Hydrogen Atom 

In 1913, Niels Bohr introduced a model of the hydrogen atom that attempted
to explain the observed spectrum of hydrogen. Bohr pictured the
hydrogen atom as consisting of an electron orbiting a positively charged
nucleus, in much the same way that a planet orbits the sun. Classically,
the force exerted on the electron by the proton follows the inverse square
law of the form

$$F = \frac{Q^2}{r^2}, \tag{1.3}$$

where $Q$ is the charge of the electron, in appropriate units.

If the electron is in a circular orbit, its trajectory in the plane of the
orbit will take the form

$$(x(t), y(t)) = (r\cos(\omega t), r\sin(\omega t)).$$

If we take the second derivative with respect to time to obtain the
acceleration vector $\mathbf{a}$, we obtain

$$\mathbf{a}(t) = (-\omega^2 r\cos(\omega t), -\omega^2 r\sin(\omega t)),$$

so that the magnitude of the acceleration vector is $\omega^2 r$. Newton’s
second law, $F = ma$, then requires that

$$m_e\omega^2 r = \frac{Q^2}{r^2},$$

so that

$$\omega = \sqrt{\frac{Q^2}{m_e r^3}}.$$

From the formula for the frequency, we can calculate that the momentum
(mass times velocity) has magnitude

$$p = \sqrt{\frac{m_e Q^2}{r}}. \tag{1.4}$$


We can also calculate the angular momentum $J$, which for a circular orbit
is just the momentum times the distance from the nucleus, as

$$J = \sqrt{m_e Q^2 r}.$$


Bohr postulated that the electron obeys classical mechanics, except that
its angular momentum is “quantized.” Specifically, in Bohr’s model, the
angular momentum is required to be an integer multiple of $\hbar$ (Planck’s
constant). Setting $J$ equal to $n\hbar$ yields

$$r_n = \frac{n^2\hbar^2}{m_e Q^2}. \tag{1.5}$$


If one calculates the energy of an orbit with radius $r_n$, one finds (Exercise 3)
that it agrees precisely with the Rydberg energies in (1.1). Bohr further 
postulated that an electron could move from one allowed state to another, 
emitting a packet of light in the process with frequency given by (1.2). 
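
Before working Exercise 3 by hand, the claim can be spot-checked numerically. The sketch below is illustrative only: it assumes units in which $m_e = Q = \hbar = 1$ (a choice made here, not in the text), and the function names are hypothetical. It computes the energy of a circular orbit at each Bohr radius and compares it with $-R/n^2$.

```python
import math

# Illustrative units chosen for this sketch only: m_e = Q = hbar = 1.
M_E = 1.0
Q = 1.0
HBAR = 1.0
RYDBERG = M_E * Q**4 / (2 * HBAR**2)  # R = m_e Q^4 / (2 hbar^2)


def bohr_radius(n):
    """Allowed radius r_n from (1.5)."""
    return n**2 * HBAR**2 / (M_E * Q**2)


def angular_momentum(r):
    """J = sqrt(m_e Q^2 r) for a circular orbit of radius r."""
    return math.sqrt(M_E * Q**2 * r)


def orbit_energy(r):
    """Kinetic plus potential energy of a circular orbit of radius r."""
    omega = math.sqrt(Q**2 / (M_E * r**3))  # from m_e omega^2 r = Q^2 / r^2
    speed = omega * r
    return 0.5 * M_E * speed**2 - Q**2 / r


for n in range(1, 6):
    r_n = bohr_radius(n)
    # Quantization condition J = n * hbar holds at r_n ...
    assert math.isclose(angular_momentum(r_n), n * HBAR)
    # ... and the orbit energy matches the Rydberg form -R / n^2.
    assert math.isclose(orbit_energy(r_n), -RYDBERG / n**2)
```

A numerical agreement of this kind is of course not a proof; it only confirms that the formulas (1.1), (1.4), and (1.5) fit together as stated.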

Bohr did not explain why the angular momentum of an electron is quantized,
nor how it moved from one allowed orbit to another. As such, his
theory of atomic behavior was clearly not complete; it belongs to the “old
quantum mechanics” that was superseded by the matrix model of Heisenberg
and the wave model of Schrödinger. Nevertheless, Bohr’s model was an
important step in the process of understanding the behavior of atoms, and
Bohr was awarded the 1922 Nobel Prize in physics for his work. Some remnant
of Bohr’s approach survives in modern quantum theory, in the WKB
approximation (Chap. 15), where the Bohr-Sommerfeld condition gives an
approximation to the energy levels of a one-dimensional quantum system.

In 1924, Louis de Broglie reinterpreted Bohr’s condition on the angular
momentum as a wave condition. The de Broglie hypothesis is that an electron
can be described by a wave, where the spatial frequency $k$ of the wave
is related to the momentum of the electron by the relation

$$p = \hbar k. \tag{1.6}$$

Here, “frequency” is defined so that the frequency of the function $\cos(kx)$
is $k$. This is “angular” frequency, which differs by a factor of $2\pi$ from the
cycles-per-unit-distance frequency. Thus, the period associated with a given
frequency $k$ is $2\pi/k$.

In de Broglie’s approach, we are supposed to imagine a wave superimposed
on the classical trajectory of the electron, with the quantization
condition now being that the wave should match up with itself when going
all the way around the orbit. This condition means that the orbit should
consist of an integer number of periods of the wave:

$$2\pi r = n\,\frac{2\pi}{k}.$$

Using (1.6) along with the expression (1.4) for $p$, we obtain

$$2\pi r = n\,\frac{2\pi\hbar}{p} = 2\pi n\hbar\sqrt{\frac{r}{m_e Q^2}}.$$

Solving this equation for $r$ gives precisely the Bohr radii in (1.5).

FIGURE 1.3. The Bohr radii for n = 1 to n = 10, with de Broglie waves
superimposed for n = 8 and n = 10.

Thus, de Broglie’s wave hypothesis gives an alternative to Bohr’s
quantization of angular momentum as an explanation of the allowed energies of
hydrogen. Of course, if one accepts de Broglie’s wave hypothesis for
electrons, one would expect to see wavelike behavior of electrons not just in the
hydrogen atom, but in other situations as well, an expectation that would 
soon be fulfilled. Figure 1.3 shows the first 10 Bohr radii. For the 8th and 
10th radii, the de Broglie wave is shown superimposed onto the orbit. 
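
The matching condition can be checked with a few lines of code. This is a hedged sketch under the assumption of units with $m_e = Q = \hbar = 1$ (chosen here for convenience, not taken from the text); the function names are hypothetical. It counts how many de Broglie periods fit around an orbit of radius $r$ and confirms that the count is the integer $n$ exactly at the Bohr radii.

```python
import math

# Illustrative units for this sketch only: m_e = Q = hbar = 1.
M_E = 1.0
Q = 1.0
HBAR = 1.0


def momentum(r):
    """Momentum on a circular orbit of radius r, from (1.4)."""
    return math.sqrt(M_E * Q**2 / r)


def de_broglie_period(r):
    """Spatial period 2*pi/k of the wave, with k = p / hbar as in (1.6)."""
    k = momentum(r) / HBAR
    return 2 * math.pi / k


def periods_around_orbit(r):
    """Number of wave periods that fit in the circumference 2*pi*r."""
    return 2 * math.pi * r / de_broglie_period(r)


def bohr_radius(n):
    """Allowed radius r_n from (1.5)."""
    return n**2 * HBAR**2 / (M_E * Q**2)


for n in (1, 2, 8, 10):
    print(n, periods_around_orbit(bohr_radius(n)))

# A radius strictly between r_1 and r_2 gives a non-integer count, so the
# wave would not match up with itself after one trip around the orbit.
print(periods_around_orbit(2.5))
```

In these units the count equals $\sqrt{r}$, so it is an integer precisely when $r = n^2$, in agreement with (1.5).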

1.2.3 Electron Diffraction 

In 1925, Clinton Davisson and Lester Germer were studying properties of 
nickel by bombarding a thin film of nickel with low-energy electrons. As a 
result of a problem with their equipment, the nickel was accidentally heated 
to a very high temperature. When the nickel cooled, it formed into large
crystalline pieces, rather than the small crystals in the original sample.
After this recrystallization, Davisson and Germer observed peaks in the 
pattern of electrons reflecting off of the nickel sample that had not been 
present when using the original sample. They were at a loss to explain this 
pattern until, in 1926, Davisson learned of the de Broglie hypothesis and 
suspected that they were observing the wavelike behavior of electrons that 
de Broglie had predicted. 

After this realization, Davisson and Germer began to look systematically
for wavelike peaks in their experiments. Specifically, they attempted
to show that the pattern of angles at which the electrons reflected matched
the patterns one sees in x-ray diffraction. After numerous additional
measurements, they were able to show a very close correspondence between
the pattern of electrons and the patterns seen in x-ray diffraction. Since 
x-rays were by this time known to be waves of electromagnetic radiation, 
the Davisson-Germer experiment was a strong confirmation of de Broglie’s 
wave picture of electrons. Davisson and Germer published their results in 
two papers in 1927, and Davisson shared the 1937 Nobel Prize in physics
with George Paget Thomson, who had observed electron diffraction shortly
after Davisson and Germer.

1.2.4 The Double-Slit Experiment with Electrons 

Although quantum theory clearly predicts that electrons passing through
a double slit will experience interference similar to that observed in light,
it was not until Claus Jönsson’s work in 1961 that this prediction was
confirmed experimentally. The main difficulty is the much smaller wavelength
for electrons of reasonable energy than for visible light. Jönsson’s
electrons, for example, had a de Broglie wavelength of 5 nm, as compared to
a wavelength of roughly 500 nm for visible light (depending on the color).

In results published in 1989, a team led by Akira Tonomura at Hitachi 
performed a double-slit experiment in which they were able to record the 
results one electron at a time. (Similar but less definitive experiments were 
carried out by Pier Giorgio Merli, GianFranco Missiroli and Giulio Pozzi 
in Bologna in 1974 and published in the American Journal of Physics in 
1976.) In the Hitachi experiment, each electron passes through the slits and 
then strikes a screen, causing a small spot of light to appear. The location of 
this spot is then recorded for each electron, one at a time. The key point is 
that each individual electron strikes the screen at a single point. That is to 
say, individual electrons are not smeared out across the screen in a wavelike 
pattern, but rather behave like point particles, in that the observed location 
of the electron is indeed a point. Each electron, however, strikes the screen 
at a different point, and once a large number of the electrons have struck 
and their locations have been recorded, an interference pattern emerges. 

It is not the variability of the locations of the electrons that is surprising,
since this could be accounted for by small variations in the way the electrons
are shot toward the slits. Rather, it is the distinctive interference pattern
that is surprising, with rapid variations in the pattern of electron strikes
over short distances, including regions where almost no electron strikes
occur. (Compare Fig. 1.4 to Fig. 1.2.) Note also that in the experiment, the
electrons are widely separated, so that there is never more than one electron
in the apparatus at any one time. Thus, the electrons cannot interfere with
one another; rather, each electron interferes with itself. Figure 1.4 shows
results from the Hitachi experiment, with the number of observed electrons
increasing from about 150 in the first image to 160,000 in the last image.

FIGURE 1.4. Four images from the 1989 experiment at Hitachi showing the
impact of individual electrons gradually building up to form an interference
pattern. Image by Akira Tonomura and Wikimedia Commons user Belsazar. File
is licensed under the Creative Commons Attribution-Share Alike 3.0 Unported
license.


1.3 Schrödinger and Heisenberg

In 1925, Werner Heisenberg proposed a model of quantum mechanics based
on treating the position and momentum of the particle as, essentially,
matrices of size $\infty \times \infty$. Actually, Heisenberg himself was not
familiar with the theory of matrices, which was not a standard part of the
mathematical education of physicists at the time. Nevertheless, he had
quantities of the form $X_{jk}$ and $p_{jk}$ (where $j$ and $k$ each vary
over all integers), which we can recognize as matrices, as well as expressions
such as $\sum_l X_{jl}\,p_{lk}$, which we can recognize as a matrix product.
After Heisenberg explained his theory to Max Born, Born recognized the
connection of Heisenberg’s formulas to matrix theory and made the matrix
point of view explicit, in a paper coauthored by Born and his assistant,
Pascual Jordan. Born, Heisenberg, and Jordan then all published a paper
together elaborating upon their theory. The papers of Heisenberg, of Born
and Jordan, and of Born, Heisenberg, and Jordan all appeared in 1925.
Heisenberg received the 1932 Nobel Prize in physics (actually awarded in
1933) for his work. Born’s exclusion from this prize was controversial, and
may have been influenced by Jordan’s connections with the Nazi party in
Germany. (Heisenberg’s own work for the Nazis during World War II was also
a source of much controversy after the war.) In any case, Born was awarded
the Nobel Prize in physics in 1954 for his work on the statistical
interpretation of quantum mechanics (Sect. 1.4).

Meanwhile, in 1926, Erwin Schrödinger published four remarkable papers
in which he proposed a wave theory of quantum mechanics, along the lines
of the de Broglie hypothesis. In these papers, Schrödinger described how the
waves evolve over time and showed that the energy levels of, for example,
the hydrogen atom could be understood as eigenvalues of a certain operator.
(See Chap. 18 for the computation for hydrogen.) Schrödinger also
showed that the Heisenberg-Born-Jordan matrix model could be incorporated
into the wave theory, thus showing that the matrix theory and the
wave theory were equivalent (see Sect. 3.8). This book describes the
mathematical structure of quantum mechanics in essentially the form proposed
by Schrödinger in 1926. Schrödinger shared the 1933 Nobel Prize in physics
with Paul Dirac.


1.4 A Matter of Interpretation 

Although Schrödinger’s 1926 papers gave the correct mathematical description
of quantum mechanics (as it is generally accepted today), he did not
provide a widely accepted interpretation of the theory. That task fell to
Born, who in a 1926 paper proposed that the “wave function” (as the wave
appearing in the Schrödinger equation is generally called) should be
interpreted statistically, that is, as determining the probabilities for
observations of the system. Over time, Born’s statistical approach developed
into the Copenhagen interpretation of quantum mechanics. Under this
interpretation, the wave function $\psi$ of the system is not directly
observable. Rather, $\psi$ merely determines the probability of observing a
particular result.

In particular, if $\psi$ is properly normalized, then the quantity
$|\psi(x)|^2$ is the probability distribution for the position of the
particle. Even if $\psi$ itself is spread out over a large region in space,
any measurement of the position of the particle will show that the particle
is located at a single point, just as we see for the electrons in the
two-slit experiment in Fig. 1.4. Thus, a measurement of a particle’s position
does not show the particle “smeared out” over a large region of space, even
if the wave function $\psi$ is smeared out over a large region.

Consider, for example, how Born’s interpretation of the Schrödinger
equation would play out in the context of the Hitachi double-slit experiment
depicted in Fig. 1.4. Born would say that each electron has a wave
function that evolves in time according to the Schrödinger equation (an
equation of wave type). Each particle’s wave function, then, will propagate
through the slits in a manner similar to that pictured in Fig. 1.1. If
there is a screen at the bottom of Fig. 1.1, then the electron will hit the
screen at a single point, even though the wave function is very spread out.
The wave function does not determine where the particle hits the screen; it
merely determines the probabilities for where the particle hits the screen. If
a whole sequence of electrons passes through the slits, one after the other,
over time a probability distribution will emerge, determined by the square
of the magnitude of the wave function, which is shown in Fig. 1.2. Thus,
the probability distribution of electrons, as seen from a large number of
electrons as in Fig. 1.4, shows wavelike interference patterns, even though
each individual electron strikes the screen at a single point.

It is essential to the theory that the wave function $\psi(x)$ itself is not
the probability density for the location of the particle. Rather, the
probability density is $|\psi(x)|^2$. The difference is crucial, because
probability densities are intrinsically nonnegative and thus do not exhibit
destructive interference.
The wave function itself, however, is complex-valued, and the real and 
imaginary parts of the wave function take on both positive and negative 
values, which can interfere constructively or destructively. The part of the 
wave function passing through the first slit, for example, can interfere with 
the part of the wave function passing through the second slit. Only after 
this interference has taken place do we take the magnitude squared of the 
wave function to obtain the probability distribution, which will, therefore, 
show the sorts of peaks and valleys we see in Fig. 1.2. 
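
The distinction between adding amplitudes and adding probabilities can be made concrete with a toy calculation. The sketch below is an illustration only, not a model of the actual experiment: it assumes that each slit contributes a unit-magnitude amplitude $e^{ikL}$ depending only on the path length $L$ to the screen, and the wavelength and function names are hypothetical choices.

```python
import cmath
import math

# Toy model: wavelength 1 in arbitrary units (an assumption of this sketch).
WAVELENGTH = 1.0
K = 2 * math.pi / WAVELENGTH


def amplitude(path_length):
    """Unit-magnitude complex amplitude exp(i*k*L) contributed by one slit."""
    return cmath.exp(1j * K * path_length)


def density_from_summed_amplitudes(path1, path2):
    """|psi_1 + psi_2|^2: amplitudes interfere before squaring."""
    return abs(amplitude(path1) + amplitude(path2)) ** 2


def density_from_summed_probabilities(path1, path2):
    """|psi_1|^2 + |psi_2|^2: nonnegative terms cannot cancel."""
    return abs(amplitude(path1)) ** 2 + abs(amplitude(path2)) ** 2


# Path differences of 0, 1/4, and 1/2 wavelength.
for difference in (0.0, 0.25, 0.5):
    with_interference = density_from_summed_amplitudes(3.0, 3.0 + difference)
    without = density_from_summed_probabilities(3.0, 3.0 + difference)
    print(f"{difference:4.2f}  {with_interference:.3f}  {without:.3f}")
```

Summing amplitudes first gives values ranging from about 4 (paths in phase) down to about 0 (paths half a wavelength apart), while summing the squared magnitudes always gives 2; only the former produces peaks and valleys.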

Born’s introduction of a probabilistic element into the interpretation of 
quantum mechanics was, and to some extent still is, controversial. Einstein,
for example, is often quoted as saying something along the lines of,
“God does not play at dice with the universe.” Einstein expressed the same 
sentiment in various ways over the years. His earliest known statement to 
this effect was in a letter to Born in December 1926, in which he said, 

Quantum mechanics is certainly imposing. But an inner voice 
tells me that it is not yet the real thing. The theory says a lot, 
but does not really bring us any closer to the secret of the “old 
one.” I, at any rate, am convinced that He does not throw dice. 

Many other physicists and philosophers have questioned the probabilistic
interpretation of quantum mechanics, and have sought alternatives, such
as “hidden variable” theories. Nevertheless, the Copenhagen interpretation
of quantum mechanics, essentially as proposed by Born in 1926, remains
the standard one. This book resolutely avoids all controversies surrounding
the interpretation of quantum mechanics. Chapter 3, for example,
presents the standard statistical interpretation of the theory without
question. The book may nevertheless be of use to the more philosophically
minded reader, in that one must learn something of quantum mechanics
before delving into the (often highly technical) discussions about its
interpretation.


1.5 Exercises 

1. Beginning with the formula for the sum of a geometric series, use
differentiation to obtain the identity

$$\sum_{n=0}^{\infty} n e^{-An} = \frac{e^{-A}}{(1 - e^{-A})^2}.$$


2. In Planck’s model of blackbody radiation, the energy in a given
frequency $\omega$ of electromagnetic radiation is distributed randomly over
all numbers of the form $n\hbar\omega$, where $n = 0, 1, 2, \ldots$.
Specifically, the likelihood of finding energy $n\hbar\omega$ is postulated
to be

$$p(E = n\hbar\omega) = \frac{1}{Z}\,e^{-\beta n\hbar\omega},$$

where $Z$ is a normalization constant, which is chosen so that the sum
over $n$ of the probabilities is 1. Here $\beta = 1/(k_B T)$, where $T$ is the
temperature and $k_B$ is Boltzmann’s constant. The expected value of
the energy, denoted $\langle E \rangle$, is defined to be

$$\langle E \rangle = \frac{1}{Z}\sum_{n=0}^{\infty} (n\hbar\omega)\,e^{-\beta n\hbar\omega}.$$

(a) Using Exercise 1, show that

$$\langle E \rangle = \frac{\hbar\omega}{e^{\beta\hbar\omega} - 1}.$$

(b) Show that $\langle E \rangle$ behaves like $1/\beta = k_B T$ for small
$\omega$, but that $\langle E \rangle$ decays exponentially as $\omega$ tends
to infinity.


Note: In applying the above calculation to blackbody radiation, one
must also take into account the number of modes having frequency
in a given range, say between $\omega_0$ and $\omega_0 + \varepsilon$. The
exact number of such frequencies depends on the shape of the cavity, but
according to Weyl’s law, this number will be approximately proportional to
$\varepsilon\omega_0^2$ for large values of $\omega_0$. Thus, the amount of
energy per unit of frequency is

$$C\,\frac{\hbar\omega^3}{e^{\beta\hbar\omega} - 1}, \tag{1.7}$$

where $C$ is a constant involving the volume of the cavity and the
speed of light. The relation (1.7) is known as Planck’s law.

3. In classical mechanics, the kinetic energy of an electron is $m_e v^2/2$,
where $v$ is the magnitude of the velocity. Meanwhile, the potential
energy associated with the force law (1.3) is $V(r) = -Q^2/r$, since
$dV/dr = F$. Show that if the particle is moving in a circular orbit
with radius $r_n$ given by (1.5), then the total energy (kinetic plus
potential) of the particle is $E_n$, as given in (1.1).
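
Before attempting proofs, the identities in Exercises 1 and 2(a) can be spot-checked numerically. The following is a rough sketch under stated assumptions: the infinite sums are truncated at a finite number of terms (2000 here, an arbitrary choice that is ample for the sample values used), and the function names are hypothetical. It is a sanity check, not a substitute for the requested derivations.

```python
import math


def truncated_series(a, terms=2000):
    """Partial sum of n * exp(-a*n) for n = 0, ..., terms - 1."""
    return sum(n * math.exp(-a * n) for n in range(terms))


def series_closed_form(a):
    """Right-hand side of the identity in Exercise 1."""
    return math.exp(-a) / (1 - math.exp(-a)) ** 2


def mean_energy_by_summation(beta, hbar_omega, terms=2000):
    """Expected energy computed directly from the postulated probabilities."""
    weights = [math.exp(-beta * n * hbar_omega) for n in range(terms)]
    z = sum(weights)  # normalization constant Z (truncated)
    return sum(n * hbar_omega * w for n, w in enumerate(weights)) / z


def mean_energy_closed_form(beta, hbar_omega):
    """Formula claimed in Exercise 2(a)."""
    return hbar_omega / (math.exp(beta * hbar_omega) - 1)


print(truncated_series(0.5), series_closed_form(0.5))
print(mean_energy_by_summation(1.0, 0.1), mean_energy_closed_form(1.0, 0.1))
# Small-frequency behavior from Exercise 2(b): the mean approaches 1/beta.
print(mean_energy_closed_form(1.0, 1e-3))
```

With $\beta = 1$, the last line should print a value close to 1, consistent with the small-$\omega$ limit $1/\beta = k_B T$.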



2 A First Approach to Classical Mechanics


2.1 Motion in ℝ¹

2.1.1 Newton’s law 

We begin by considering the motion of a single particle in $\mathbb{R}^1$,
which may be thought of as a particle sliding along a wire, or a particle
with motion that just happens to lie in a line. We let $x(t)$ denote the
particle’s position as a function of time. The particle’s velocity is then

$$v(t) = \dot{x}(t),$$

where we use a dot over a symbol to denote the derivative of that quantity
with respect to the time $t$.

The particle’s acceleration is then

$$a(t) = \dot{v}(t) = \ddot{x}(t),$$

where $\ddot{x}$ denotes the second derivative of $x$ with respect to $t$. We
assume that there is a force acting on the particle and we assume at first
that the force $F$ is a function of the particle’s position only. (Later, we
will look at the case of forces that depend also on velocity.)

Under these assumptions, Newton’s second law ($F = ma$) takes the form

$$F(x(t)) = ma = m\ddot{x}(t), \tag{2.1}$$

where $m$ is the mass of the particle, which is assumed to be positive. We will
henceforth abbreviate Newton’s second law as simply “Newton’s law,” since
we will use the second law much more frequently than the others. Since
(2.1) is of second order, the appropriate initial conditions (needed to get
a unique solution) are the position and velocity at some initial time $t_0$.
So we look for solutions of (2.1) subject to

$$x(t_0) = x_0, \qquad \dot{x}(t_0) = v_0.$$

Assuming that $F$ is a smooth function, standard results from the elementary
theory of differential equations tell us that there exists a unique
local solution to (2.1) for each pair of initial conditions. (A local solution
is one defined for $t$ in a neighborhood of the initial time $t_0$.) Since
(2.1) is in general a nonlinear equation, one cannot expect that, for a
general force function $F$, the solutions will exist for all $t$. If, for
example, $F(x) = x^2$, then any solution with positive initial position and
positive initial velocity will escape to infinity in finite time. (Apply
Exercise 4 with $V(x) = -x^3/3$.) For a proof of existence and uniqueness,
see Example 8.2 and Theorem 8.13 in [28].

Definition 2.1 A solution x(t) to Newton’s law is called a trajectory. 

Example 2.2 (Harmonic Oscillator) If the force is given by Hooke’s
law, $F(x) = -kx$, where $k$ is a positive constant, then Newton’s law can be
written as $m\ddot{x} + kx = 0$. The general solution of this equation is

$$x(t) = a\cos(\omega t) + b\sin(\omega t),$$

where $\omega := \sqrt{k/m}$ is the frequency of oscillation.

The system in Example 2.2 is referred to as a (classical) harmonic
oscillator. This system can describe a mass on a spring, where the force is
proportional to the distance $x$ that the spring is stretched from its
equilibrium position. The minus sign in $-kx$ indicates that the force pulls
the oscillator back toward equilibrium. Here and elsewhere in the book, we
use the “angular” notion of frequency, which is the rate of change of the
argument of a sine or cosine function. If $\omega$ is the angular frequency,
then the “ordinary” frequency (i.e., the number of cycles per unit of time)
is $\omega/2\pi$. Saying that $x$ has (angular) frequency $\omega$ means that
$x$ is periodic with period $2\pi/\omega$.

2.1.2 Conservation of Energy 

We return now to the case of a general force function $F(x)$. We define
the kinetic energy of the system to be $\frac{1}{2}mv^2$. We also define the
potential energy of the system as the function

$$V(x) = -\int F(x)\,dx, \tag{2.2}$$

so that $F(x) = -dV/dx$. (The potential energy is defined only up to adding
a constant.) The total energy $E$ of the system is then

$$E(x, v) = \frac{1}{2}mv^2 + V(x). \tag{2.3}$$

The chief significance of the energy function is that it is conserved, meaning
that its value along any trajectory is constant.

Theorem 2.3 Suppose a particle satisfies Newton’s law in the form
$m\ddot{x} = F(x)$. Let $V$ and $E$ be as in (2.2) and (2.3). Then the energy
$E$ is conserved, meaning that for each solution $x(t)$ of Newton’s law,
$E(x(t), \dot{x}(t))$ is independent of $t$.

Proof. We verify this by differentiation, using the chain rule:

$$\begin{aligned}
\frac{d}{dt}E(x(t), \dot{x}(t))
&= \frac{d}{dt}\left(\frac{1}{2}m(\dot{x}(t))^2 + V(x(t))\right) \\
&= m\dot{x}(t)\ddot{x}(t) + \frac{dV}{dx}\dot{x}(t) \\
&= \dot{x}(t)[m\ddot{x}(t) - F(x(t))].
\end{aligned}$$

This last expression is zero by Newton’s law. Thus, the time-derivative of
the energy along any trajectory is zero, so $E(x(t), \dot{x}(t))$ is
independent of $t$, as claimed. ■

We may call the energy a conserved quantity (or constant of motion),
since the particle neither gains nor loses energy as the particle moves 
according to Newton’s law. 
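
Theorem 2.3 can be observed in a simple numerical experiment. The sketch below is an illustration under stated assumptions: it integrates Newton's law for the harmonic oscillator of Example 2.2 with the velocity Verlet scheme (a standard integrator chosen here, not a method from the text), with hypothetical parameter values $m = 1$, $k = 4$. The energy along the computed trajectory should stay constant up to a small discretization error.

```python
import math


def simulate(force, potential, mass, x0, v0, dt, steps):
    """Integrate m * x'' = F(x) with velocity Verlet; record the energy."""
    x, v = x0, v0
    a = force(x) / mass
    energies = [0.5 * mass * v * v + potential(x)]
    for _ in range(steps):
        x += v * dt + 0.5 * a * dt * dt      # position update
        a_new = force(x) / mass              # acceleration at new position
        v += 0.5 * (a + a_new) * dt          # velocity update
        a = a_new
        energies.append(0.5 * mass * v * v + potential(x))
    return x, v, energies


# Hypothetical parameters for the illustration: Hooke's law with k = 4, m = 1.
MASS = 1.0
K_SPRING = 4.0
DT = 1e-3
STEPS = 10_000

x_end, v_end, energies = simulate(
    force=lambda x: -K_SPRING * x,
    potential=lambda x: 0.5 * K_SPRING * x * x,
    mass=MASS, x0=1.0, v0=0.0, dt=DT, steps=STEPS,
)

drift = max(abs(e - energies[0]) for e in energies)
omega = math.sqrt(K_SPRING / MASS)
print("largest energy deviation:", drift)
print("x(T) numeric vs exact:", x_end, math.cos(omega * DT * STEPS))
```

The small residual drift is an artifact of the finite time step, not a failure of the theorem; it shrinks as the step is reduced.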

Let us see how conservation of energy helps us understand the solution
to Newton’s law. We may reduce the second-order equation $m\ddot{x} = F(x)$
to a pair of first-order equations, simply by introducing the velocity $v$ as
a new variable. That is, we look for pairs of functions $(x(t), v(t))$ that
satisfy the following system of equations

$$\frac{dx}{dt} = v(t), \qquad \frac{dv}{dt} = \frac{1}{m}F(x(t)). \tag{2.4}$$

If $(x(t), v(t))$ is a solution to this system, then we can immediately see
that $x(t)$ satisfies Newton’s law, just by substituting $dx/dt$ for $v$ in
the second equation. We refer to the set of possible pairs of the form
$(x, v)$ (i.e., $\mathbb{R}^2$) as the phase space of the particle in
$\mathbb{R}^1$. The appropriate initial conditions for this first-order
system are $x(0) = x_0$ and $v(0) = v_0$.

Once we are working in phase space, we can use the conservation of
energy to help us. Conservation of energy means that each solution to
the system (2.4) must lie entirely on a single “level curve” of the energy
function, that is, the set

$$\{(x, v) \in \mathbb{R}^2 \mid E(x, v) = E(x_0, v_0)\}. \tag{2.5}$$

If $F$, and therefore also $V$, is smooth, then $E$ is a smooth function of
$x$ and $v$. Then as long as (2.5) contains no critical points of $E$, this
set will be a smooth curve in $\mathbb{R}^2$, by the implicit function
theorem. If the level set (2.5) is also a simple closed curve, then the
solutions of (2.4) will simply wind around and around this curve. Thus, the
set that the solutions to (2.4) trace out in phase space can be determined
simply from the conservation of energy. The only thing not apparent at the
moment is how this curve is parameterized as a function of time.
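
As a concrete instance, consider the harmonic oscillator of Example 2.2 with the potential normalized as $V(x) = \frac{1}{2}kx^2$ (the additive constant is a choice made for this illustration). The level set (2.5) with energy $E_0 = E(x_0, v_0)$ can then be written out explicitly:

```latex
\[
\frac{1}{2}mv^2 + \frac{1}{2}kx^2 = E_0
\quad\Longleftrightarrow\quad
\frac{x^2}{2E_0/k} + \frac{v^2}{2E_0/m} = 1 .
\]
```

For $E_0 > 0$ this is an ellipse with semi-axes $\sqrt{2E_0/k}$ in the $x$-direction and $\sqrt{2E_0/m}$ in the $v$-direction. The only critical point of $E$ is the origin, which lies on the level set $E_0 = 0$, so every level set with positive energy is a simple closed curve around which the trajectory winds.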

In mechanics, a conserved quantity, such as the energy in the
one-dimensional version of Newton’s law, is often referred to as an “integral
of motion.” The reason for this is that although Newton’s second law is a
second-order equation in $x$, the energy depends only on $x$ and $\dot{x}$
and not on $\ddot{x}$. Thus, the equation

$$\frac{m}{2}(\dot{x}(t))^2 + V(x(t)) = E_0,$$

where $E_0$ is the value of the energy at time $t_0$, is actually a first-order
differential equation. We can solve for $\dot{x}$ to put this equation into a
more standard form:

$$\dot{x}(t) = \pm\sqrt{\frac{2(E_0 - V(x(t)))}{m}}. \tag{2.6}$$

What this means is that by using conservation of energy we have turned the
original second-order equation into a first-order equation. We have therefore
“integrated” the original equation once, that is, changed an equation of
the form $\ddot{x}(t) = \cdots$ into an equation of the form
$\dot{x}(t) = \cdots$. The first-order equation (2.6) is separable and can be
solved more-or-less explicitly (Exercise 1).
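
To see what "more-or-less explicitly" looks like in one case, here is a sketch for the harmonic oscillator, assuming $V(x) = \frac{1}{2}kx^2$, $E_0 > 0$, and a time interval on which $\dot{x} > 0$ (so the plus sign in (2.6) applies). The symbols $A$ and $\varphi_0$ are introduced only for this illustration.

```latex
\begin{align*}
\frac{dx}{dt} &= \sqrt{\frac{2E_0 - kx^2}{m}}
  = \omega\sqrt{A^2 - x^2},
  \qquad \omega = \sqrt{k/m},\quad A = \sqrt{2E_0/k}, \\
\omega\,(t - t_0) &= \int_{x_0}^{x(t)} \frac{dy}{\sqrt{A^2 - y^2}}
  = \arcsin\frac{x(t)}{A} - \arcsin\frac{x_0}{A}, \\
x(t) &= A\sin\bigl(\omega(t - t_0) + \varphi_0\bigr),
  \qquad \varphi_0 = \arcsin\frac{x_0}{A}.
\end{align*}
```

This recovers a solution of the form given in Example 2.2, with amplitude $A$ determined by the energy. For a general potential the integral usually cannot be evaluated in closed form, which is the sense of the qualifier above.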

2.1.3 Systems with Damping 

Up to now, we have considered forces that depend only on position. It is
common, however, to consider forces that depend on the velocity as well
as the position. In the case of a damped harmonic oscillator, for example,
one typically assumes that there is, in addition to the force of the spring,
a damping force (friction, say) that is proportional to the velocity. Thus,
$F = -kx - \gamma\dot{x}$, where $k$ is, as before, the spring constant and
where $\gamma > 0$ is the damping constant. The minus sign in front of
$\gamma\dot{x}$ reflects that the damping force operates in the opposite
direction to the velocity, causing the particle to slow down. The equation of
motion for such a system is then

$$m\ddot{x} + \gamma\dot{x} + kx = 0.$$

If $\gamma$ is small, the solutions to this equation display decaying
oscillation, meaning sines and cosines multiplied by a decaying exponential;
if $\gamma$ is large, the solutions are pure decaying exponentials
(Exercise 5).
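
The dividing line between "small" and "large" $\gamma$ can be read off from the characteristic equation. The sketch below is illustrative and rests on the standard trial solution $x(t) = e^{st}$, which leads to $ms^2 + \gamma s + k = 0$; the function names, the sample parameter values, and the label "critical damping" for the borderline case $\gamma^2 = 4mk$ are introduced here and are not taken from the text.

```python
import cmath


def characteristic_roots(mass, gamma, k):
    """Roots s of mass*s^2 + gamma*s + k = 0 (trial solution x = exp(s*t))."""
    root = cmath.sqrt(gamma**2 - 4 * mass * k)
    return (-gamma + root) / (2 * mass), (-gamma - root) / (2 * mass)


def regime(mass, gamma, k):
    """Classify the solutions by the sign of the discriminant."""
    discriminant = gamma**2 - 4 * mass * k
    if discriminant < 0:
        return "decaying oscillation"        # complex roots, negative real part
    if discriminant > 0:
        return "pure decaying exponentials"  # two distinct negative real roots
    return "critical damping"                # repeated real root


for gamma in (0.5, 4.0, 5.0):
    print(gamma, regime(1.0, gamma, 4.0), characteristic_roots(1.0, gamma, 4.0))
```

With $m = 1$ and $k = 4$, the value $\gamma = 0.5$ gives complex roots with real part $-\gamma/2m$ (oscillation inside a decaying envelope), while $\gamma = 5$ gives two negative real roots.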

In the case of the damped harmonic oscillator, there is no longer a
conserved energy. Specifically, there is no nonconstant continuous function
$E$ on $\mathbb{R}^2$ such that $E(x(t), \dot{x}(t))$ is independent of $t$
for all solutions of Newton’s law. To see this, we simply observe that for
$\gamma > 0$, all solutions $x(t)$ have the property that
$(x(t), \dot{x}(t))$ tends to the origin in the plane as $t$ tends to
infinity. Thus, if $E$ is continuous and constant along each trajectory, the
value of $E$ at the starting point has to be the same as the value at the
origin. Since every point of $\mathbb{R}^2$ is the starting point of some
trajectory, $E$ must then be constant.

We now consider a general system with damping. 

Proposition 2.4 Suppose a particle moves in the presence of a force law
given by $F(x, \dot{x}) = F_1(x) - \gamma\dot{x}$, with $\gamma > 0$. Define
the energy $E$ of the system by

$$E(x, \dot{x}) = \frac{1}{2}m\dot{x}^2 + V(x),$$

where $dV/dx = -F_1(x)$. Then along any trajectory $x(t)$, we have

$$\frac{d}{dt}E(x(t), \dot{x}(t)) = -\gamma\dot{x}(t)^2 \le 0.$$

Thus, although the energy is not conserved, it is decreasing with time,
which gives us some information about the behavior of the system.

Proof. We differentiate as in the proof of Theorem 2.3, except that now
$dV/dx = -F_1(x)$:

$$\frac{d}{dt}E(x(t), \dot{x}(t)) = \dot{x}(t)[m\ddot{x}(t) - F_1(x(t))].$$

Since $F_1$ is not the full force function, the quantity in square brackets
equals not zero but $-\gamma\dot{x}$. Thus, $dE/dt = -\gamma\dot{x}^2$. ■

We can interpret Proposition 2.4 as saying that in the presence of friction, 
the system we are studying gives up some of its energy to heat energy in 
the environment, so that the energy of our system decreases with time. 
We will see that in higher dimensions, it is possible to have conservation 
of energy in the presence of velocity-dependent forces, provided that these 
forces act perpendicularly to the velocity. 
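
Proposition 2.4 can also be watched in a simulation. The following sketch makes several assumptions that are not in the text: it uses the damped harmonic oscillator with $F_1(x) = -kx$, hypothetical parameter values $m = 1$, $\gamma = 0.5$, $k = 4$, and a classical fourth-order Runge-Kutta step as the integrator. It records the energy along the trajectory and checks that it never increases beyond rounding error.

```python
def damped_energy_history(mass, gamma, k, x0, v0, dt, steps):
    """RK4 integration of m*x'' = -k*x - gamma*x'; returns the energy history."""

    def deriv(x, v):
        # First-order form: x' = v, v' = (F_1(x) - gamma*v) / m
        return v, (-k * x - gamma * v) / mass

    def energy(x, v):
        return 0.5 * mass * v * v + 0.5 * k * x * x

    x, v = x0, v0
    energies = [energy(x, v)]
    for _ in range(steps):
        k1x, k1v = deriv(x, v)
        k2x, k2v = deriv(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x, k3v = deriv(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x, k4v = deriv(x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        energies.append(energy(x, v))
    return energies


# Hypothetical parameters for the illustration.
energies = damped_energy_history(
    mass=1.0, gamma=0.5, k=4.0, x0=1.0, v0=0.0, dt=0.01, steps=2000
)
largest_increase = max(b - a for a, b in zip(energies, energies[1:]))
print("initial energy:", energies[0])
print("final energy:  ", energies[-1])
print("largest single-step increase:", largest_increase)
```

The energy decreases fastest when the speed is large and momentarily levels off whenever $\dot{x} = 0$, as the formula $dE/dt = -\gamma\dot{x}^2$ predicts.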


2.2 Motion in ℝⁿ

We now consider a particle moving in $\mathbb{R}^n$. The position
$\mathbf{x} = (x_1, \ldots, x_n)$ of a particle is now a vector in
$\mathbb{R}^n$, as are the velocity $\mathbf{v}$ and acceleration
$\mathbf{a}$. We let

$$\dot{\mathbf{x}} = (\dot{x}_1, \ldots, \dot{x}_n)$$

denote the derivative of $\mathbf{x}$ with respect to $t$ and we let
$\ddot{\mathbf{x}}$ denote the second derivative of $\mathbf{x}$ with respect
to $t$. Newton’s law now takes the form

$$m\ddot{\mathbf{x}}(t) = \mathbf{F}(\mathbf{x}(t), \dot{\mathbf{x}}(t)), \tag{2.7}$$

where $\mathbf{F} : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ is some
force law, which in general may depend on both the position and velocity of
the particle.

We begin by considering forces that are independent of velocity, and we 
look for a conserved energy function in this setting. 

Proposition 2.5 Consider Newton’s law (2.7) in the case of a
velocity-independent force: $m\ddot{\mathbf{x}}(t) = \mathbf{F}(\mathbf{x}(t))$.
Then an energy function of the form

$$E(\mathbf{x}, \dot{\mathbf{x}}) = \frac{1}{2}m|\dot{\mathbf{x}}|^2 + V(\mathbf{x})$$

is conserved if and only if $V$ satisfies

$$-\nabla V = \mathbf{F},$$

where $\nabla V$ is the gradient of $V$.

Saying that $E$ is “conserved” means that
$E(\mathbf{x}(t), \dot{\mathbf{x}}(t))$ is independent of $t$ for each
solution $\mathbf{x}(t)$ of Newton’s law. The function $V$ is the potential
energy of the system.

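Proof. Differentiating gives

$$\begin{aligned}
\frac{d}{dt}\left(\frac{1}{2}m|\dot{\mathbf{x}}(t)|^2 + V(\mathbf{x}(t))\right)
&= m\sum_{j=1}^{n}\dot{x}_j(t)\ddot{x}_j(t)
   + \sum_{j=1}^{n}\frac{\partial V}{\partial x_j}\dot{x}_j(t) \\
&= \dot{\mathbf{x}}(t)\cdot[m\ddot{\mathbf{x}}(t) + \nabla V(\mathbf{x}(t))] \\
&= \dot{\mathbf{x}}(t)\cdot[\mathbf{F}(\mathbf{x}(t)) + \nabla V(\mathbf{x}(t))].
\end{aligned}$$

Thus, $dE/dt$ will always be equal to zero if and only if we have

$$-\nabla V(\mathbf{x}) = \mathbf{F}(\mathbf{x})$$

for all $\mathbf{x}$. ■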

We now encounter something that did not occur in the one-dimensional
case. In $\mathbb{R}^1$, any smooth function can be expressed as the
derivative of some other function. In $\mathbb{R}^n$, however, not every
vector-valued function $\mathbf{F}(\mathbf{x})$ can be expressed as (the
negative of) the gradient of some scalar-valued function $V$.

Definition 2.6 Suppose $\mathbf{F}$ is a smooth, $\mathbb{R}^n$-valued
function on a domain $U \subset \mathbb{R}^n$. Then $\mathbf{F}$ is called
conservative if there exists a smooth, real-valued function $V$ on $U$ such
that $\mathbf{F} = -\nabla V$.

If the domain U is simply connected, then there is a simple local condition 
that characterizes conservative functions. 



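Proposition 2.7 Suppose $U$ is a simply connected domain in $\mathbb{R}^n$
and $\mathbf{F}$ is a smooth, $\mathbb{R}^n$-valued function on $U$. Then
$\mathbf{F}$ is conservative if and only if $\mathbf{F}$ satisfies

$$\frac{\partial F_j}{\partial x_k} - \frac{\partial F_k}{\partial x_j} = 0 \tag{2.8}$$

for all $j$ and $k$, at each point in $U$.

When $n = 3$, it is easy to check that the condition (2.8) is equivalent
to the curl $\nabla \times \mathbf{F}$ of $\mathbf{F}$ being zero on $U$. The
hypothesis that $U$ be simply connected cannot be omitted; see Exercise 7.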

Proof. If $\mathbf{F}$ is conservative, then

$$\frac{\partial F_j}{\partial x_k}
= -\frac{\partial^2 V}{\partial x_k\,\partial x_j}
= -\frac{\partial^2 V}{\partial x_j\,\partial x_k}
= \frac{\partial F_k}{\partial x_j}$$

at every point in $U$. In the other direction, if $\mathbf{F}$ satisfies
(2.8), $V$ can be obtained by integrating $\mathbf{F}$ along paths and using
the Stokes theorem to establish independence of choice of path. See, for
example, Theorem 4.3 on p. 549 of [44] for a proof in the $n = 3$ case. The
proof in higher dimensions is the same, provided one knows the general
version of the Stokes theorem. ■
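
The role of simple connectivity can be illustrated with a standard example, which is not necessarily the one intended in Exercise 7: the field $\mathbf{F}(x, y) = (-y, x)/(x^2 + y^2)$ on the punctured plane. The sketch below, with hypothetical function names and a finite-difference step chosen for illustration, checks numerically that (2.8) holds at a sample point and that the line integral of $\mathbf{F}$ around a circle enclosing the origin is nevertheless nonzero, which a conservative field would not allow.

```python
import math


def field(x, y):
    """F(x, y) = (-y, x) / (x^2 + y^2), defined away from the origin."""
    r2 = x * x + y * y
    return -y / r2, x / r2


def mixed_partial_gap(x, y, h=1e-5):
    """Central-difference estimate of dF_1/dx_2 - dF_2/dx_1 at (x, y)."""
    dF1_dy = (field(x, y + h)[0] - field(x, y - h)[0]) / (2 * h)
    dF2_dx = (field(x + h, y)[1] - field(x - h, y)[1]) / (2 * h)
    return dF1_dy - dF2_dx


def circulation(radius=1.0, samples=10_000):
    """Midpoint-rule line integral of F around a circle about the origin."""
    total = 0.0
    dtheta = 2 * math.pi / samples
    for i in range(samples):
        theta = (i + 0.5) * dtheta
        x, y = radius * math.cos(theta), radius * math.sin(theta)
        fx, fy = field(x, y)
        dx = -radius * math.sin(theta) * dtheta
        dy = radius * math.cos(theta) * dtheta
        total += fx * dx + fy * dy
    return total


print("condition (2.8) gap at (0.7, -1.3):", mixed_partial_gap(0.7, -1.3))
print("circulation around the unit circle:", circulation())
print("2*pi for comparison:               ", 2 * math.pi)
```

If $\mathbf{F}$ were $-\nabla V$ for a smooth $V$ on the punctured plane, the integral around any closed loop would vanish; the value $2\pi$ shows that no such $V$ exists, even though the local condition (2.8) is satisfied.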

We may also consider velocity-dependent forces. If, for example,
$\mathbf{F}(\mathbf{x}, \mathbf{v}) = -\gamma\mathbf{v} + \mathbf{F}_1(\mathbf{x})$,
where $\gamma$ is a positive constant, then we will again have
energy that is decreasing with time. There is another new phenomenon,
however, in dimension greater than 1, namely the possibility of having a
conserved energy even when the force depends on velocity.

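Proposition 2.8 Suppose a particle in $\mathbb{R}^n$ moves in the presence of a force
$\mathbf{F}$ of the form

\[ \mathbf{F}(\mathbf{x}, \mathbf{v}) = -\nabla V(\mathbf{x}) + \mathbf{F}_2(\mathbf{x}, \mathbf{v}), \]

where $V$ is a smooth function and where $\mathbf{F}_2$ satisfies

\[ \mathbf{v} \cdot \mathbf{F}_2(\mathbf{x}, \mathbf{v}) = 0 \tag{2.9} \]

for all $\mathbf{x}$ and $\mathbf{v}$ in $\mathbb{R}^n$. Then the energy function $E(\mathbf{x}, \mathbf{v}) = \frac{1}{2} m |\mathbf{v}|^2 + V(\mathbf{x})$
is constant along each trajectory.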

If, for example, $\mathbf{F}_2$ is the force exerted on a charged particle in $\mathbb{R}^3$ by a
magnetic field $\mathbf{B}(\mathbf{x})$, then

\[ \mathbf{F}_2(\mathbf{x}, \mathbf{v}) = q \mathbf{v} \times \mathbf{B}(\mathbf{x}), \]

where $q$ is the charge of the particle, which clearly satisfies (2.9), since
$\mathbf{v} \times \mathbf{B}$ is perpendicular to $\mathbf{v}$.

Proof. See Exercise 8. ■


2.3 Systems of Particles

If we have a system of $N$ particles, each moving in $\mathbb{R}^n$, then we denote the
position of the $j$th particle by

\[ \mathbf{x}^j = (x^j_1, \ldots, x^j_n). \]

Thus, in the expression $x^j_k$, the superscript $j$ indicates the $j$th particle, while
the subscript $k$ indicates the $k$th component. Newton’s law then takes the
form

\[ m_j \ddot{\mathbf{x}}^j = \mathbf{F}^j(\mathbf{x}^1, \ldots, \mathbf{x}^N, \dot{\mathbf{x}}^1, \ldots, \dot{\mathbf{x}}^N), \quad j = 1, 2, \ldots, N, \]

where $m_j$ is the mass of the $j$th particle. Here, $\mathbf{F}^j$ is the force on the $j$th
particle, which in general will depend on the position and velocity not only
of that particle, but also on the position and velocity of the other particles.

2.3.1 Conservation of Energy 

In a system of particles, we cannot expect that the energy of each individual
particle will be conserved, because as the particles interact, they can
exchange energy. Rather, we should expect that, under suitable assumptions
on the forces $\mathbf{F}^j$, we can define a conserved energy function for the
whole system (the total energy of the system).

Let us consider forces depending only on the position of the particles,
and let us assume that the energy function will be of the form

\[ E(\mathbf{x}^1, \ldots, \mathbf{x}^N, \mathbf{v}^1, \ldots, \mathbf{v}^N) = \sum_{j=1}^N \frac{1}{2} m_j |\mathbf{v}^j|^2 + V(\mathbf{x}^1, \ldots, \mathbf{x}^N). \tag{2.10} \]

We will now try to see what form for $V$ (if any) will allow $E$ to be constant
along each trajectory.

Proposition 2.9 An energy function of the form (2.10) is constant along
each trajectory if

\[ \nabla^j V = -\mathbf{F}^j \tag{2.11} \]

for each $j$, where $\nabla^j$ is the gradient with respect to the variable $\mathbf{x}^j$.

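Proof. We compute that

\[
\begin{aligned}
\frac{dE}{dt} &= \sum_{j=1}^N \left[ m_j \dot{\mathbf{x}}^j \cdot \ddot{\mathbf{x}}^j + \nabla^j V \cdot \dot{\mathbf{x}}^j \right] \\
&= \sum_{j=1}^N \dot{\mathbf{x}}^j \cdot \left[ m_j \ddot{\mathbf{x}}^j + \nabla^j V \right] \\
&= \sum_{j=1}^N \dot{\mathbf{x}}^j \cdot \left[ \mathbf{F}^j + \nabla^j V \right].
\end{aligned}
\]

If $\nabla^j V = -\mathbf{F}^j$, then $E$ will be conserved. ■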

As in the one-particle case, there is a simple condition for the existence 
of a potential function V satisfying (2.11). 

Proposition 2.10 Suppose a force function $\mathbf{F} = (\mathbf{F}^1, \ldots, \mathbf{F}^N)$ is defined
on a simply connected domain $U$ in $\mathbb{R}^{nN}$. Then there exists a smooth
function $V$ on $U$ satisfying

\[ \nabla^j V = -\mathbf{F}^j \]

for all $j$ if and only if we have

\[ \frac{\partial F^j_k}{\partial x^l_m} = \frac{\partial F^l_m}{\partial x^j_k} \tag{2.12} \]

for all $j$, $k$, $l$, and $m$.

Proof. Apply Proposition 2.7 with $n$ replaced by $nN$ and with $j$ and $k$
replaced by the pairs $(j, k)$ and $(l, m)$. ■


2.3.2 Conservation of Momentum 

We now introduce the notion of the momentum of a particle. 

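Definition 2.11 In an $N$-particle system, the momentum of the $j$th
particle, denoted $\mathbf{p}^j$, is the product of the mass and the velocity of that
particle:

\[ \mathbf{p}^j = m_j \dot{\mathbf{x}}^j. \]

The total momentum of the system, denoted $\mathbf{p}$, is defined as

\[ \mathbf{p} = \sum_{j=1}^N \mathbf{p}^j. \]

Observe that

\[ \frac{d\mathbf{p}^j}{dt} = m_j \ddot{\mathbf{x}}^j = \mathbf{F}^j. \]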

Thus, Newton’s law may be reformulated as saying, “The force is the rate 
of change of the momentum.” This is how Newton originally formulated 
his second law. 

Newton’s third law says, “For every action, there is an equal and opposite
reaction.” This law will apply if all forces are of the “two-particle” variety
and satisfy a natural symmetry property. Having two-particle forces means
that the force $\mathbf{F}^j$ on the $j$th particle is a sum of terms $\mathbf{F}^{j,k}$, $j \neq k$, where
$\mathbf{F}^{j,k}$ depends only on $\mathbf{x}^j$ and $\mathbf{x}^k$. The relevant symmetry property is that
$\mathbf{F}^{j,k}(\mathbf{x}^j, \mathbf{x}^k) = -\mathbf{F}^{k,j}(\mathbf{x}^k, \mathbf{x}^j)$; that is, the force exerted by the $j$th particle
on the $k$th particle is the negative (i.e., “equal and opposite”) of the force
exerted by the $k$th particle on the $j$th particle. If the forces are assumed
also to be conservative, then the potential energy of the system will be of
the form

\[ V(\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N) = \sum_{j<k} V^{j,k}(\mathbf{x}^j - \mathbf{x}^k). \tag{2.13} \]

One important consequence of Newton’s third law is conservation of the 
total momentum of the system. 

Proposition 2.12 Suppose that for each $j$, the force on the $j$th particle is
of the form

\[ \mathbf{F}^j(\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N) = \sum_{k \neq j} \mathbf{F}^{j,k}(\mathbf{x}^j, \mathbf{x}^k) \]

for certain functions $\mathbf{F}^{j,k}$. Suppose also that we have the “equal and
opposite” condition

\[ \mathbf{F}^{j,k}(\mathbf{x}^j, \mathbf{x}^k) = -\mathbf{F}^{k,j}(\mathbf{x}^j, \mathbf{x}^k). \]

Then the total momentum of the system is conserved.

Note that since the rate of change of $\mathbf{p}^j$ is $\mathbf{F}^j$, the force on the $j$th
particle, the momentum of each individual particle is not constant in time,
except in the trivial case of a noninteracting system (one in which all forces
are zero).

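Proof. Differentiating gives

\[ \frac{d\mathbf{p}}{dt} = \sum_{j=1}^N \frac{d\mathbf{p}^j}{dt} = \sum_{j=1}^N \mathbf{F}^j = \sum_{j} \sum_{k \neq j} \mathbf{F}^{j,k}(\mathbf{x}^j, \mathbf{x}^k). \]

By the equal and opposite condition, $\mathbf{F}^{j,k}(\mathbf{x}^j, \mathbf{x}^k)$ cancels with $\mathbf{F}^{k,j}(\mathbf{x}^j, \mathbf{x}^k)$,
so $d\mathbf{p}/dt = 0$. ■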
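
The cancellation in this proof can be observed in a small simulation. The sketch below is an illustration under stated assumptions rather than anything from the text: three particles in $\mathbb{R}^2$ interact through hypothetical spring forces $\mathbf{F}^{j,k} = -\kappa(\mathbf{x}^j - \mathbf{x}^k)$, and the masses, initial data, and the velocity Verlet stepper are our own choices.

```python
# Three particles in R^2 with pairwise spring forces F^{j,k} = -kappa (x^j - x^k).
# Masses, kappa, initial data, and the velocity Verlet scheme are assumptions.

def forces(xs, kappa=1.5):
    """Net force on each particle as a sum of two-particle terms."""
    n = len(xs)
    out = [[0.0, 0.0] for _ in range(n)]
    for j in range(n):
        for k in range(n):
            if k != j:
                for c in range(2):
                    out[j][c] += -kappa * (xs[j][c] - xs[k][c])
    return out

def total_momentum(ms, vs):
    """Sum of m_j v^j over all particles, component by component."""
    return [sum(m * v[c] for m, v in zip(ms, vs)) for c in range(2)]

def simulate(ms, xs, vs, dt=1e-3, steps=2000):
    """Velocity Verlet: half kick, drift, recompute forces, half kick."""
    xs = [list(x) for x in xs]
    vs = [list(v) for v in vs]
    f = forces(xs)
    for _ in range(steps):
        for j, m in enumerate(ms):
            for c in range(2):
                vs[j][c] += 0.5 * dt * f[j][c] / m
                xs[j][c] += dt * vs[j][c]
        f = forces(xs)
        for j, m in enumerate(ms):
            for c in range(2):
                vs[j][c] += 0.5 * dt * f[j][c] / m
    return xs, vs

masses = [1.0, 2.0, 3.0]
xs0 = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
vs0 = [[0.1, 0.0], [0.0, -0.2], [0.3, 0.1]]
_, vs1 = simulate(masses, xs0, vs0)
print(total_momentum(masses, vs0), total_momentum(masses, vs1))
```

The individual velocities change substantially during the run, while the two printed total momenta agree up to floating-point roundoff, because the pairwise terms cancel in every kick.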

Let us consider, now, a more general situation in which we have conservative
forces, but not necessarily of the “two-particle” form. It is still
possible to have conservation of momentum, as the following result shows.

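Proposition 2.13 If a multiparticle system has a force law coming from
a potential $V$, then the total momentum of the system is conserved if and
only if

\[ V(\mathbf{x}^1 + \mathbf{a}, \mathbf{x}^2 + \mathbf{a}, \ldots, \mathbf{x}^N + \mathbf{a}) = V(\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N) \tag{2.14} \]

for all $\mathbf{a} \in \mathbb{R}^n$.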

Proof. Apply (2.14) with $\mathbf{a} = t\mathbf{e}_k$, where $\mathbf{e}_k$ is the vector with a 1 in the
$k$th spot and zeros elsewhere. Differentiating with respect to $t$ at $t = 0$
gives

\[ 0 = \sum_{j=1}^N \frac{\partial V}{\partial x^j_k} = -\sum_{j=1}^N F^j_k = -\sum_{j=1}^N \frac{dp^j_k}{dt} = -\frac{dp_k}{dt}, \]

where $p_k$ is the $k$th component of the total momentum $\mathbf{p}$. Thus, if (2.14)
holds, $\mathbf{p}$ is constant in time.

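Conversely, if the momentum is conserved, then the sum of the forces is
zero at every point, and so

\[
\begin{aligned}
\frac{d}{dt} V(\mathbf{x}^1 + t\mathbf{a}, \mathbf{x}^2 + t\mathbf{a}, \ldots, \mathbf{x}^N + t\mathbf{a})
&= \sum_{j=1}^N \nabla^j V(\mathbf{x}^1 + t\mathbf{a}, \mathbf{x}^2 + t\mathbf{a}, \ldots, \mathbf{x}^N + t\mathbf{a}) \cdot \mathbf{a} \\
&= -\left( \sum_{j=1}^N \mathbf{F}^j(\mathbf{x}^1 + t\mathbf{a}, \mathbf{x}^2 + t\mathbf{a}, \ldots, \mathbf{x}^N + t\mathbf{a}) \right) \cdot \mathbf{a} \\
&= 0
\end{aligned}
\]

for all $t$. Thus, the value of the quantity being differentiated is the same at
$t = 0$ as at $t = 1$, which establishes (2.14). ■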

The moral of the story is that conservation of momentum is a consequence
of translation-invariance of the system, where “translation invariance”
means invariance under simultaneous translations of every particle by the
same amount. (See Exercise 11 for a more general version of this result.)
If the potential is of the “two-particle” form (2.13), then it is evident that
the condition (2.14) is satisfied.


2.3.3 Center of Mass 

We now consider an important application of momentum conservation. 

Definition 2.14 For a system of $N$ particles moving in $\mathbb{R}^n$, the center
of mass of the system at a fixed time is the vector $\mathbf{c} \in \mathbb{R}^n$ given by

\[ \mathbf{c} = \sum_{j=1}^N \frac{m_j}{M} \mathbf{x}^j, \]

where $M = \sum_{j=1}^N m_j$ is the total mass of the system.

The center of mass is a weighted average of the positions of the various
particles. Differentiating $\mathbf{c}(t)$ with respect to $t$ gives

\[ \frac{d\mathbf{c}}{dt} = \frac{1}{M} \sum_{j=1}^N m_j \dot{\mathbf{x}}^j = \frac{\mathbf{p}}{M}, \tag{2.15} \]

where $\mathbf{p}$ is the total momentum.

Proposition 2.15 Suppose the total momentum $\mathbf{p}$ of a system is conserved.
Then the center of mass moves in a straight line at constant speed.
Specifically,

\[ \mathbf{c}(t) = \mathbf{c}(t_0) + (t - t_0) \frac{\mathbf{p}}{M}, \]

where $\mathbf{c}(t_0)$ is the center of mass at some initial time $t_0$.

Proof. The result follows easily from (2.15). ■

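The notion of center of mass is particularly useful in a system of two
particles in which momentum is conserved. For a system of two particles, if
the potential energy $V(\mathbf{x}^1, \mathbf{x}^2)$ is invariant under simultaneous translations
of $\mathbf{x}^1$ and $\mathbf{x}^2$, then it is of the form

\[ V(\mathbf{x}^1, \mathbf{x}^2) = \tilde{V}(\mathbf{x}^1 - \mathbf{x}^2), \]

where $\tilde{V}(\mathbf{a}) = V(\mathbf{a}, 0)$.

Now, the positions $\mathbf{x}^1$, $\mathbf{x}^2$ of the particles can be recovered from knowledge
of the center of mass and the relative position

\[ \mathbf{y} := \mathbf{x}^1 - \mathbf{x}^2 \]

as follows:

\[
\begin{aligned}
\mathbf{x}^1 &= \mathbf{c} + \frac{m_2}{m_1 + m_2} \mathbf{y} \\
\mathbf{x}^2 &= \mathbf{c} - \frac{m_1}{m_1 + m_2} \mathbf{y}.
\end{aligned}
\]

Meanwhile, we may compute that

\[ \ddot{\mathbf{y}} = \ddot{\mathbf{x}}^1 - \ddot{\mathbf{x}}^2 = -\frac{1}{m_1} \nabla \tilde{V}(\mathbf{x}^1 - \mathbf{x}^2) - \frac{1}{m_2} \nabla \tilde{V}(\mathbf{x}^1 - \mathbf{x}^2). \]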

This calculation gives the following result. 


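Proposition 2.16 For a two-particle system with potential energy of the
form $V(\mathbf{x}^1, \mathbf{x}^2) = \tilde{V}(\mathbf{x}^1 - \mathbf{x}^2)$, the relative position $\mathbf{y} := \mathbf{x}^1 - \mathbf{x}^2$ satisfies
the differential equation

\[ \mu \ddot{\mathbf{y}} = -\nabla \tilde{V}(\mathbf{y}), \]

where $\mu$ is the reduced mass given by

\[ \mu = \frac{m_1 m_2}{m_1 + m_2}. \]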
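
Proposition 2.16 can be illustrated numerically. The sketch below assumes a hypothetical quadratic potential $\tilde{V}(y) = \kappa y^2/2$ on $\mathbb{R}^1$, together with masses, initial data, and a velocity Verlet integrator of our own choosing; it integrates the full two-particle system and the one-particle system with mass $\mu$ and compares the resulting relative positions.

```python
# Compare the relative coordinate of a two-particle system with a single
# particle of reduced mass mu. Potential, masses, and integrator are assumptions.

def grad_V(y, kappa=2.0):
    """Gradient of the illustrative potential V~(y) = kappa * y^2 / 2 on R^1."""
    return kappa * y

def two_particle_relative(m1, m2, x1, x2, v1, v2, dt, steps):
    """Integrate both particles with velocity Verlet; return x1 - x2 at the end."""
    f = -grad_V(x1 - x2)            # force on particle 1; particle 2 feels -f
    for _ in range(steps):
        v1 += 0.5 * dt * f / m1
        v2 -= 0.5 * dt * f / m2
        x1 += dt * v1
        x2 += dt * v2
        f = -grad_V(x1 - x2)
        v1 += 0.5 * dt * f / m1
        v2 -= 0.5 * dt * f / m2
    return x1 - x2

def one_particle(mu, y, w, dt, steps):
    """Integrate mu * y'' = -grad V~(y) with the same scheme."""
    f = -grad_V(y)
    for _ in range(steps):
        w += 0.5 * dt * f / mu
        y += dt * w
        f = -grad_V(y)
        w += 0.5 * dt * f / mu
    return y

m1, m2 = 1.0, 3.0
mu = m1 * m2 / (m1 + m2)
y_two = two_particle_relative(m1, m2, 1.0, -0.5, 0.2, 0.0, 1e-3, 5000)
y_one = one_particle(mu, 1.5, 0.2, 1e-3, 5000)
print(abs(y_two - y_one))  # small, limited by floating-point roundoff
```

With this particular integrator the discrete relative coordinate obeys the same recursion as the reduced-mass particle, so the two results differ only by roundoff.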


Thus, when the total momentum of a two-particle system is conserved,
the relative position evolves as a one-particle system with “effective” mass $\mu$,
while the center of mass moves “trivially,” as described in Proposition 2.15.

FIGURE 2.1. A(t) is the area of the shaded region.


2.4 Angular Momentum 

We start by considering angular momentum in the simplest nontrivial case,
motion in $\mathbb{R}^2$.

Definition 2.17 Consider a particle moving in $\mathbb{R}^2$, having position $\mathbf{x}$,
velocity $\mathbf{v}$, and momentum $\mathbf{p} = m\mathbf{v}$. Then the angular momentum of
the particle, denoted $J$, is given by

\[ J = x_1 p_2 - x_2 p_1. \tag{2.16} \]

In more geometric terms, $J = |\mathbf{x}| |\mathbf{p}| \sin\phi$, where $\phi$ is the angle (measured
counterclockwise) between $\mathbf{x}$ and $\mathbf{p}$. We can look at $J$ in yet another way
as follows. If $\theta$ is the usual angle in polar coordinates on $\mathbb{R}^2$, then an
elementary calculation (Exercise 9) shows that

\[ J = m r^2 \frac{d\theta}{dt}. \tag{2.17} \]

It then follows that

\[ J = 2m \frac{dA}{dt}, \tag{2.18} \]

where $A = (1/2) \int r^2 \, d\theta$ is the area being swept out by the curve $\mathbf{x}(t)$.
See Fig. 2.1.

One significant property of the angular momentum is that it (like the 
energy) is conserved in certain situations. 


Proposition 2.18 Suppose a particle of mass $m$ is moving in $\mathbb{R}^2$ under
the influence of a conservative force with the potential function $V(\mathbf{x})$. If
$V$ is invariant under rotations in $\mathbb{R}^2$, then the angular momentum
$J = x_1 p_2 - x_2 p_1$ is independent of time along any solution of Newton’s equation.
Conversely, if $J$ is independent of time along every solution of Newton’s
equation, then $V$ is invariant under rotations.

Proof. Differentiating (2.16) along a solution of Newton’s law gives

\[
\begin{aligned}
\frac{dJ}{dt} &= \frac{dx_1}{dt} p_2 + x_1 \frac{dp_2}{dt} - \frac{dx_2}{dt} p_1 - x_2 \frac{dp_1}{dt} \\
&= \frac{1}{m} p_1 p_2 - x_1 \frac{\partial V}{\partial x_2} - \frac{1}{m} p_2 p_1 + x_2 \frac{\partial V}{\partial x_1} \\
&= x_2 \frac{\partial V}{\partial x_1} - x_1 \frac{\partial V}{\partial x_2}.
\end{aligned}
\]

On the other hand, consider rotations $R_\theta$ in $\mathbb{R}^2$ given by

\[ R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}. \]

If we differentiate $V$ along this family of rotations, we obtain

\[ \left. \frac{d}{d\theta} V(R_\theta \mathbf{x}) \right|_{\theta=0} = \frac{\partial V}{\partial x_1} \frac{dx_1}{d\theta} + \frac{\partial V}{\partial x_2} \frac{dx_2}{d\theta} = -x_2 \frac{\partial V}{\partial x_1} + x_1 \frac{\partial V}{\partial x_2} = -\frac{dJ}{dt}. \]

Thus, the angular derivative of $V$ is zero if and only if $J$ is constant. ■

Conservation of $J$ [together with the relation (2.18)] gives the following
result.


Corollary 2.19 (Kepler’s Second Law) Suppose a particle is moving
in $\mathbb{R}^2$ in the presence of a force associated with a rotationally invariant
potential. If $\mathbf{x}(t)$ is the trajectory of the particle, then the area swept out by
$\mathbf{x}(t)$ between times $t = a$ and $t = b$ is $(b - a)J/(2m)$, where $J$ is the constant
value of the angular momentum along the trajectory. Since the area swept
out depends only on $b - a$, we may say that “equal areas are swept out in
equal times.”
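
The corollary can be observed in a simulation. The following sketch assumes an inverse-square attraction with $m = k = 1$ and initial data of our own choosing, and uses velocity Verlet, which for central forces preserves $x_1 p_2 - x_2 p_1$ up to roundoff. The swept area is approximated by summing the triangles spanned by the origin and consecutive positions.

```python
# Equal areas in equal times for a central force in R^2.
# Force law, parameters, and initial data are illustrative assumptions.

def cross2(a, b):
    """Scalar cross product a1*b2 - a2*b1 in the plane."""
    return a[0] * b[1] - a[1] * b[0]

def swept_areas(m=1.0, k=1.0, dt=1e-3, steps=4000, chunk=1000):
    """Return (J values, polygonal areas swept in consecutive blocks of steps)."""
    x = [1.0, 0.0]
    p = [0.0, 0.8]

    def force(x):
        r3 = (x[0] ** 2 + x[1] ** 2) ** 1.5
        return [-k * x[0] / r3, -k * x[1] / r3]

    f = force(x)
    Js, areas, area = [cross2(x, p)], [], 0.0
    for n in range(1, steps + 1):
        p = [p[c] + 0.5 * dt * f[c] for c in range(2)]
        x_new = [x[c] + dt * p[c] / m for c in range(2)]
        area += 0.5 * cross2(x, x_new)      # triangle (0, x, x_new)
        x = x_new
        f = force(x)
        p = [p[c] + 0.5 * dt * f[c] for c in range(2)]
        if n % chunk == 0:
            Js.append(cross2(x, p))
            areas.append(area)
            area = 0.0
    return Js, areas

Js, areas = swept_areas()
print(Js)      # all entries agree with the initial value J = 0.8
print(areas)   # each block of duration 1 sweeps area J / (2 m) = 0.4
```

Each block covers a time interval of length 1, so the corollary predicts an area of $J/(2m) = 0.4$ per block, regardless of where on the orbit the block begins.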

Kepler, of course, was interested in the motion of planets in $\mathbb{R}^3$, not in
$\mathbb{R}^2$. The motion of a planet moving in the “inverse square” force of a sun
will, however, always lie in a plane. (This claim follows from the
three-dimensional version of conservation of angular momentum, as explained in
Sect. 2.6.1.)

In $\mathbb{R}^3$, the angular momentum of the particle is a vector, given by

\[ \mathbf{J} = \mathbf{x} \times \mathbf{p}, \tag{2.19} \]

where $\times$ denotes the cross product (or vector product). Thus, for example,

\[ J_3 = x_1 p_2 - x_2 p_1. \tag{2.20} \]

If, then, we have a particle in $\mathbb{R}^3$ that just happens to be moving in $\mathbb{R}^2$
(i.e., $x_3 = 0$ and $p_3 = 0$), then the angular momentum will be in the
$z$-direction with $z$-component given by the quantity $J$ defined in
Definition 2.17.

The representation of the angular momentum of a particle in $\mathbb{R}^3$ as a
vector is a low-dimensional peculiarity. For a particle in $\mathbb{R}^n$, the angular
momentum is a skew-symmetric matrix given by

\[ J_{jk} = x_j p_k - x_k p_j. \tag{2.21} \]

In the $\mathbb{R}^3$ case, the entries of the $3 \times 3$ angular momentum matrix are made
up by the three components of the angular momentum vector together with
their negatives, with zeros along the diagonal. [Compare, e.g., (2.20) and
(2.21).]

Definition 2.20 For a system of $N$ particles moving in $\mathbb{R}^n$, the total
angular momentum of the system is the skew-symmetric matrix $J$ given
by

\[ J_{jk} = \sum_{l=1}^N \left( x^l_j p^l_k - x^l_k p^l_j \right). \tag{2.22} \]

Theorem 2.21 Suppose a system of $N$ particles in $\mathbb{R}^n$ is moving under
the influence of conservative forces with potential function $V$. If $V$ satisfies

\[ V(R\mathbf{x}^1, R\mathbf{x}^2, \ldots, R\mathbf{x}^N) = V(\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N) \tag{2.23} \]

for every rotation matrix $R$, then the total angular momentum of the system
is conserved (constant along each trajectory). Conversely, if the total angular
momentum is constant along each trajectory, then $V$ satisfies (2.23).

The proof of this result is similar to that of Proposition 2.18 and is left 
as an exercise (Exercise 10). We will re-examine the concept of angular 
momentum in the next section using the language of Poisson brackets and 
Hamiltonian flows. 


2.5 Poisson Brackets and Hamiltonian Mechanics 

We consider now the Hamiltonian approach to classical mechanics. (There 
is also the Lagrangian approach, but that approach is not as relevant for 
our purposes.) The Hamiltonian approach, and in particular the Poisson 
bracket, will help us to understand the general phenomenon of conserved 
quantities. The Poisson bracket is also an important source of motivation 
for the use of commutators in quantum mechanics. 

In the Hamiltonian approach to mechanics, we think of the energy function
as a function of position and momentum, rather than position and
velocity, and we refer to it as the “Hamiltonian.” If a particle in $\mathbb{R}^n$ has
the usual sort of energy function (kinetic energy plus potential energy), we
have

\[ H(\mathbf{x}, \mathbf{p}) = \frac{1}{2m} \sum_{j=1}^n p_j^2 + V(\mathbf{x}). \tag{2.24} \]

Here, as usual, $p_j = m\dot{x}_j$. We now observe that Newton’s law can be
expressed in the following form:

\[
\begin{aligned}
\frac{dx_j}{dt} &= \frac{\partial H}{\partial p_j} \\
\frac{dp_j}{dt} &= -\frac{\partial H}{\partial x_j}.
\end{aligned}
\tag{2.25}
\]

After all, with $H$ of the indicated form, these equations read $dx_j/dt = p_j/m$,
which is just the definition of $p_j$, and $dp_j/dt = -\partial V/\partial x_j = F_j$, which
is just Newton’s law, in the form originally given by Newton. We refer to
Newton’s law, in the form (2.25), as Hamilton’s equations.

Although it is not obvious at the moment that we have gained anything 
by writing Newton’s law in the form (2.25), let us proceed on a bit further 
and see. Our next step is to introduce the Poisson bracket. 


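Definition 2.22 Let $f$ and $g$ be two smooth functions on $\mathbb{R}^{2n}$, where an
element of $\mathbb{R}^{2n}$ is thought of as a pair $(\mathbf{x}, \mathbf{p})$, with $\mathbf{x} \in \mathbb{R}^n$ representing the
position of a particle and $\mathbf{p} \in \mathbb{R}^n$ representing the momentum of a particle.
Then the Poisson bracket of $f$ and $g$, denoted $\{f, g\}$, is the function on
$\mathbb{R}^{2n}$ given by

\[ \{f, g\}(\mathbf{x}, \mathbf{p}) = \sum_{j=1}^n \left( \frac{\partial f}{\partial x_j} \frac{\partial g}{\partial p_j} - \frac{\partial f}{\partial p_j} \frac{\partial g}{\partial x_j} \right). \]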

The Poisson bracket has the following properties. 

Proposition 2.23 For all smooth functions $f$, $g$, and $h$ on $\mathbb{R}^{2n}$ we have
the following:

1. $\{f, g + ch\} = \{f, g\} + c\{f, h\}$ for all $c \in \mathbb{R}$

2. $\{g, f\} = -\{f, g\}$

3. $\{f, gh\} = \{f, g\}h + g\{f, h\}$

4. $\{f, \{g, h\}\} = \{\{f, g\}, h\} + \{g, \{f, h\}\}$

Properties 1 and 2 of Proposition 2.23 say that the Poisson bracket is
bilinear and skew-symmetric. Property 3 says that the operation of “bracket
with $f$” satisfies the derivation property (similar to the product rule for
derivatives) with respect to pointwise multiplication of functions, while
Property 4 says that “bracket with $f$” satisfies the derivation property
with respect to the Poisson bracket itself. Property 4 is equivalent to the
Jacobi identity:

\[ \{f, \{g, h\}\} + \{h, \{f, g\}\} + \{g, \{h, f\}\} = 0, \tag{2.26} \]

as may easily be seen using the skew-symmetry of the Poisson bracket.
The Jacobi identity, along with bilinearity and skew-symmetry, means that
the space of $C^\infty$ functions on $\mathbb{R}^{2n}$ forms a Lie algebra under the operation
of Poisson bracket. (See Chap. 16.)

Proof. The first two properties of the Poisson bracket are obvious and the
third is an easy consequence of the product rule. Let us think about what
goes into proving Property 4 by direct computation. (An alternative proof
is given in Exercise 15.) We compute that

\[
\begin{aligned}
\{f, \{g, h\}\} &= \sum_{j=1}^n \frac{\partial f}{\partial x_j} \frac{\partial}{\partial p_j} \left( \frac{\partial g}{\partial x_j} \frac{\partial h}{\partial p_j} - \frac{\partial g}{\partial p_j} \frac{\partial h}{\partial x_j} \right) \\
&\quad - \sum_{j=1}^n \frac{\partial f}{\partial p_j} \frac{\partial}{\partial x_j} \left( \frac{\partial g}{\partial x_j} \frac{\partial h}{\partial p_j} - \frac{\partial g}{\partial p_j} \frac{\partial h}{\partial x_j} \right).
\end{aligned}
\]

Just the first term in the expression for $\{f, \{g, h\}\}$ generates the following
four terms (all summed over $j$) after we use the product rule:

\[ \frac{\partial f}{\partial x_j} \frac{\partial^2 g}{\partial x_j \partial p_j} \frac{\partial h}{\partial p_j} + \frac{\partial f}{\partial x_j} \frac{\partial g}{\partial x_j} \frac{\partial^2 h}{\partial p_j^2} - \frac{\partial f}{\partial x_j} \frac{\partial^2 g}{\partial p_j^2} \frac{\partial h}{\partial x_j} - \frac{\partial f}{\partial x_j} \frac{\partial g}{\partial p_j} \frac{\partial^2 h}{\partial x_j \partial p_j}. \tag{2.27} \]

We see, then, that the left-hand side of (2.26) will have a total of 24 terms,
each summed over $j$. Each term will have a single derivative on two of the
three functions, and two derivatives on the third function. There are three
possibilities for which function gets two derivatives. Once that function is
chosen, there are four possibilities for which derivatives go on the other
two functions, with the function that gets two derivatives getting whatever
derivatives remain (for a total of two $x$-derivatives and two $p$-derivatives).
That makes 12 possible terms. It is a tedious but straightforward exercise
to check that each of these 12 possible terms occurs twice in the left-hand
side of (2.26), with opposite signs. To check just one case explicitly, in
computing $\{h, \{f, g\}\}$, we will get a term like the second term in (2.27),
but with $(f, g, h)$ replaced by $(h, f, g)$:

\[ \frac{\partial h}{\partial x_j} \frac{\partial f}{\partial x_j} \frac{\partial^2 g}{\partial p_j^2}. \]

This term (in the computation of $\{h, \{f, g\}\}$) cancels with the third term
in (2.27) (in the computation of $\{f, \{g, h\}\}$). ■

The following elementary result will provide a helpful analogy to the 
“canonical commutation relations” in quantum mechanics. 


Proposition 2.24 The position and momentum functions satisfy the following
Poisson bracket relations:

\[
\begin{aligned}
\{x_j, x_k\} &= 0 \\
\{p_j, p_k\} &= 0 \\
\{x_j, p_k\} &= \delta_{jk}.
\end{aligned}
\]

Proof. Direct calculation. ■

One of the main reasons for considering the Poisson bracket is the 
following simple result. 

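Proposition 2.25 If $(\mathbf{x}(t), \mathbf{p}(t))$ is a solution to Hamilton’s equation
(2.25), then for any smooth function $f$ on $\mathbb{R}^{2n}$ we have

\[ \frac{d}{dt} f(\mathbf{x}(t), \mathbf{p}(t)) = \{f, H\}(\mathbf{x}(t), \mathbf{p}(t)). \]

We generally write Proposition 2.25 in a more concise form as

\[ \frac{df}{dt} = \{f, H\}, \]

where the time derivative is understood as being along some trajectory.

Proof. Using the chain rule and Hamilton’s equations, we have

\[
\begin{aligned}
\frac{df}{dt} &= \sum_{j=1}^n \left( \frac{\partial f}{\partial x_j} \frac{dx_j}{dt} + \frac{\partial f}{\partial p_j} \frac{dp_j}{dt} \right) \\
&= \sum_{j=1}^n \left( \frac{\partial f}{\partial x_j} \frac{\partial H}{\partial p_j} + \frac{\partial f}{\partial p_j} \left( -\frac{\partial H}{\partial x_j} \right) \right) \\
&= \{f, H\},
\end{aligned}
\]

as claimed. ■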

Observe that Proposition 2.25 includes Hamilton’s equations themselves
as special cases, by taking $f(\mathbf{x}, \mathbf{p}) = x_j$ and by taking $f(\mathbf{x}, \mathbf{p}) = p_j$. Thus,
this proposition gives a more coordinate-independent way of expressing the
time-evolution.

Corollary 2.26 Call a smooth function $f$ on $\mathbb{R}^{2n}$ a conserved quantity if
$f(\mathbf{x}(t), \mathbf{p}(t))$ is independent of $t$ for each solution $(\mathbf{x}(t), \mathbf{p}(t))$ of Hamilton’s
equations. Then $f$ is a conserved quantity if and only if

\[ \{f, H\} = 0. \]

In particular, the Hamiltonian $H$ is a conserved quantity.

Conserved quantities are also called constants of motion. See Conclusion
2.31 for another perspective on this result. Conserved quantities (when one
can find them) are useful in that we know that trajectories must lie in
the level surfaces of any conserved quantity. Suppose, for example, that
we have a particle moving in $\mathbb{R}^2$ and that the Hamiltonian $H$ and one
other independent function $f$ (such as, say, the angular momentum) are
conserved quantities. Then, rather than looking for trajectories in the
four-dimensional phase space, we look for them inside the joint level sets of $H$
and $f$ (sets of the form $H(\mathbf{x}, \mathbf{p}) = a$, $f(\mathbf{x}, \mathbf{p}) = b$, for some constants $a$
and $b$). These joint level sets are (generically) two-dimensional instead of
four-dimensional, so using the constants of motion greatly simplifies the
problem, from an equation in four variables to one in only two variables.
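
The bracket of Definition 2.22 is simple to approximate numerically, which gives a quick way to test whether a candidate function is conserved. The sketch below is illustrative only: the helper `poisson`, the mass, the radial potential $|\mathbf{x}|^4$, and the sample point are assumptions of ours, and partial derivatives are estimated by central differences.

```python
# Central-difference Poisson bracket on R^{2n}, following Definition 2.22.
# The Hamiltonian, sample point, and helper names are illustrative assumptions.

def poisson(f, g, x, p, h=1e-5):
    """Estimate {f, g} at the phase-space point (x, p)."""
    def d(fn, which, j):
        xa, pa, xb, pb = list(x), list(p), list(x), list(p)
        if which == "x":
            xa[j] += h
            xb[j] -= h
        else:
            pa[j] += h
            pb[j] -= h
        return (fn(xa, pa) - fn(xb, pb)) / (2 * h)
    return sum(d(f, "x", j) * d(g, "p", j) - d(f, "p", j) * d(g, "x", j)
               for j in range(len(x)))

m = 2.0

def H(x, p):
    # kinetic energy plus a radial potential |x|^4 (illustrative choice)
    return (p[0] ** 2 + p[1] ** 2) / (2 * m) + (x[0] ** 2 + x[1] ** 2) ** 2

def J(x, p):
    # angular momentum (2.16)
    return x[0] * p[1] - x[1] * p[0]

def x1(x, p):
    return x[0]

def p1(x, p):
    return p[0]

pt = ([0.4, -0.7], [1.1, 0.3])
print(poisson(x1, p1, *pt))   # close to 1, as in Proposition 2.24
print(poisson(J, H, *pt))     # close to 0: J is conserved for a radial potential
print(poisson(x1, H, *pt))    # close to p_1 / m = 0.55, Hamilton's first equation
```

By Corollary 2.26, a bracket with $H$ that vanishes identically signals a conserved quantity; a numerical check at sample points can refute conservation but cannot prove it.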

Solving Hamilton’s equations on $\mathbb{R}^{2n}$ gives rise to a flow on $\mathbb{R}^{2n}$, that is, a
family $\Phi_t$ of diffeomorphisms of $\mathbb{R}^{2n}$, where $\Phi_t(\mathbf{x}, \mathbf{p})$ is equal to the solution
at time $t$ of Hamilton’s equations with initial condition $(\mathbf{x}, \mathbf{p})$. Since it is
possible (depending on the choice of potential function $V$) that a particle
can escape to infinity in finite time, the maps $\Phi_t$ are not necessarily defined
on all of $\mathbb{R}^{2n}$, but only on some open subset thereof. If $\Phi_t$ does happen to
be defined on all of $\mathbb{R}^{2n}$ (for all $t$), then we say that the flow is complete.

Theorem 2.27 (Liouville’s Theorem) The flow associated with Hamilton’s
equations, for an arbitrary Hamiltonian function $H$, preserves the
$2n$-dimensional volume measure

\[ dx_1 \, dx_2 \cdots dx_n \, dp_1 \, dp_2 \cdots dp_n. \]

What this means, more precisely, is that if a measurable set $E$ is contained
in the domain of $\Phi_t$ for some $t \in \mathbb{R}$, then the volume of $\Phi_t(E)$ is
equal to the volume of $E$.

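Proof. Hamilton’s equations may be written as

\[
\frac{d}{dt}
\begin{pmatrix} x_1 \\ \vdots \\ x_n \\ p_1 \\ \vdots \\ p_n \end{pmatrix}
=
\begin{pmatrix}
\partial H/\partial p_1 \\ \vdots \\ \partial H/\partial p_n \\
-\partial H/\partial x_1 \\ \vdots \\ -\partial H/\partial x_n
\end{pmatrix}.
\tag{2.28}
\]

This means that Hamilton’s equations describe the flow along the vector
field on $\mathbb{R}^{2n}$ appearing on the right-hand side of (2.28). By a standard result
from vector calculus (see, e.g., Proposition 16.33 in [29]), this flow will be
volume-preserving if and only if the divergence of the vector field is zero.
We compute this divergence as

\[ \frac{\partial}{\partial x_1} \frac{\partial H}{\partial p_1} + \cdots + \frac{\partial}{\partial x_n} \frac{\partial H}{\partial p_n} - \frac{\partial}{\partial p_1} \frac{\partial H}{\partial x_1} - \cdots - \frac{\partial}{\partial p_n} \frac{\partial H}{\partial x_n}. \tag{2.29} \]

Since

\[ \frac{\partial^2 H}{\partial x_j \partial p_j} = \frac{\partial^2 H}{\partial p_j \partial x_j}, \]

the divergence is zero. ■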
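
For $n = 1$, Liouville's theorem says that the derivative matrix of the flow map has determinant 1. The following sketch checks this numerically under assumptions of our own: the pendulum Hamiltonian $H = p^2/2 - \cos x$, a classical Runge-Kutta (RK4) approximation of the flow, and central differences for the derivative matrix.

```python
# Numerical check that a Hamiltonian flow on R^2 preserves area.
# The Hamiltonian H = p^2/2 - cos(x), RK4, and step sizes are assumptions.
import math

def flow(x, p, t, steps=2000):
    """Approximate the time-t Hamiltonian flow of H = p^2/2 - cos(x) with RK4."""
    def rhs(x, p):
        return p, -math.sin(x)      # (dH/dp, -dH/dx)
    dt = t / steps
    for _ in range(steps):
        k1 = rhs(x, p)
        k2 = rhs(x + 0.5 * dt * k1[0], p + 0.5 * dt * k1[1])
        k3 = rhs(x + 0.5 * dt * k2[0], p + 0.5 * dt * k2[1])
        k4 = rhs(x + dt * k3[0], p + dt * k3[1])
        x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        p += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return x, p

def jacobian_det(x, p, t, h=1e-4):
    """Determinant of the 2x2 derivative matrix of the time-t flow map."""
    xa, pa = flow(x + h, p, t)
    xb, pb = flow(x - h, p, t)
    xc, pc = flow(x, p + h, t)
    xd, pd = flow(x, p - h, t)
    dxdx, dpdx = (xa - xb) / (2 * h), (pa - pb) / (2 * h)
    dxdp, dpdp = (xc - xd) / (2 * h), (pc - pd) / (2 * h)
    return dxdx * dpdp - dxdp * dpdx

print(jacobian_det(1.0, 0.5, 3.0))   # close to 1
```

The individual entries of the derivative matrix are far from those of the identity at $t = 3$; only their determinant is constrained.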

The existence of an invariant volume has important consequences for
the dynamics of a system. For example, for “confined” systems, an invariant
volume implies that the system exhibits “recurrence,” which means












(roughly) that for most initial conditions, the particle will eventually come
back arbitrarily close to its initial state in phase space. We will not, however,
delve into this aspect of the theory.

Note that the divergence of $X_H$, computed in (2.29), vanishes in a very
particular way, namely the sum of the $j$th and $(n + j)$th terms vanishes
for all $1 \leq j \leq n$. This stronger condition turns out to be equivalent to
the condition that the Hamiltonian flow $\Phi_t$ associated with an arbitrary
smooth function on $\mathbb{R}^{2n}$ preserves the symplectic form $\omega$, defined by

\[ \omega((\mathbf{x}, \mathbf{p}), (\mathbf{x}', \mathbf{p}')) = \mathbf{x} \cdot \mathbf{p}' - \mathbf{p} \cdot \mathbf{x}'. \]

What this means, more precisely, is that for any $t \in \mathbb{R}$ and any $(\mathbf{x}, \mathbf{p}) \in \mathbb{R}^{2n}$,
the matrix of partial derivatives of $\Phi_t$ at the point $(\mathbf{x}, \mathbf{p})$, thought of as a
linear map of $\mathbb{R}^{2n}$ to $\mathbb{R}^{2n}$, preserves $\omega$. This property of $\Phi_t$, as it turns out,
is equivalent to the property that $\Phi_t$ preserves Poisson brackets, meaning
that

\[ \{f \circ \Phi_t, g \circ \Phi_t\} = \{f, g\} \circ \Phi_t \]

for all $f, g \in C^\infty(\mathbb{R}^{2n})$. A map $\Psi : \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ that preserves $\omega$ is called
a symplectomorphism (in mathematics notation) or a canonical transformation
(in physics notation). We defer the proofs of these claims until
Chap. 21, where we can consider them in a more general setting.

Definition 2.28 For any smooth function $f$ on $\mathbb{R}^{2n}$, the Hamiltonian
flow generated by $f$ is the flow obtained by solving Hamilton’s equation (2.25)
with the Hamiltonian $H$ replaced by $f$. The function $f$ is called the
Hamiltonian generator of the associated flow.

Although any smooth function on $\mathbb{R}^{2n}$ can be inserted into Hamilton’s
equations to produce a flow, physically one should think that there is a
distinguished function, the Hamiltonian $H$ of the system, such that the
flow generated by $H$ is the time-evolution of the system. For any other
function $f$, the Hamiltonian flow generated by $f$ should not be thought
of as time-evolution, but as some other flow, which might, for example,
represent some family of symmetries of our system.

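Proposition 2.29 The Hamiltonian flow generated by the function

\[ f_{\mathbf{a}}(\mathbf{x}, \mathbf{p}) := \mathbf{a} \cdot \mathbf{p} \tag{2.30} \]

is given by

\[
\begin{aligned}
\mathbf{x}(t) &= \mathbf{x}_0 + t\mathbf{a} \\
\mathbf{p}(t) &= \mathbf{p}_0,
\end{aligned}
\tag{2.31}
\]

and the Hamiltonian flow generated by the function

\[ g_{\mathbf{b}}(\mathbf{x}, \mathbf{p}) := \mathbf{b} \cdot \mathbf{x} \tag{2.32} \]

is given by

\[
\begin{aligned}
\mathbf{x}(t) &= \mathbf{x}_0 \\
\mathbf{p}(t) &= \mathbf{p}_0 - t\mathbf{b}.
\end{aligned}
\]

Proof. Direct calculation. ■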

What this means is that the Hamiltonian flow generated by a linear
combination of the momentum functions consists of translations in position
of the particle. That is to say, in the flow (2.31) generated by the function
$f_{\mathbf{a}}$ in (2.30), the particle’s initial position $\mathbf{x}_0$ is translated by $t\mathbf{a}$ while the
particle’s momentum is independent of $t$. Similarly, the Hamiltonian flow
generated by a linear combination of the position functions [the function
$g_{\mathbf{b}}$ in (2.32)] consists of translations in the particle’s momentum.

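Proposition 2.30 For a particle moving in $\mathbb{R}^2$, the Hamiltonian flow generated
by the angular momentum function

\[ J(\mathbf{x}, \mathbf{p}) = x_1 p_2 - x_2 p_1 \]

consists of simultaneous rotations of $\mathbf{x}$ and $\mathbf{p}$. That is to say,

\[
\begin{aligned}
\begin{pmatrix} x_1(t) \\ x_2(t) \end{pmatrix} &=
\begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix}
\begin{pmatrix} x_1(0) \\ x_2(0) \end{pmatrix} \\
\begin{pmatrix} p_1(t) \\ p_2(t) \end{pmatrix} &=
\begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix}
\begin{pmatrix} p_1(0) \\ p_2(0) \end{pmatrix}.
\end{aligned}
\tag{2.33}
\]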

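Proof. If we plug the angular momentum function $J$ into Hamilton’s equations
in place of $H$, we obtain

\[
\begin{aligned}
\frac{dx_1}{dt} &= \frac{\partial J}{\partial p_1} = -x_2; & \frac{dp_1}{dt} &= -\frac{\partial J}{\partial x_1} = -p_2 \\
\frac{dx_2}{dt} &= \frac{\partial J}{\partial p_2} = x_1; & \frac{dp_2}{dt} &= -\frac{\partial J}{\partial x_2} = p_1.
\end{aligned}
\]

The solution to this system is given by the expression in the proposition,
as is easily verified by differentiation of (2.33). ■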
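
As a sanity check on Proposition 2.30, one can integrate Hamilton's equations with $J$ in place of $H$ and compare against the rotation formula (2.33). The sketch below uses a classical RK4 integrator; the initial data, the flow time, and the function names are illustrative assumptions.

```python
# Integrate the Hamiltonian flow generated by J = x1*p2 - x2*p1 and compare
# with rotation by angle t. Initial data and step count are assumptions.
import math

def j_flow(x, p, t, steps=1000):
    """RK4 integration of dx/dt = dJ/dp, dp/dt = -dJ/dx."""
    def rhs(s):
        x1, x2, p1, p2 = s
        return [-x2, x1, -p2, p1]
    s = [x[0], x[1], p[0], p[1]]
    dt = t / steps
    for _ in range(steps):
        k1 = rhs(s)
        k2 = rhs([s[i] + 0.5 * dt * k1[i] for i in range(4)])
        k3 = rhs([s[i] + 0.5 * dt * k2[i] for i in range(4)])
        k4 = rhs([s[i] + dt * k3[i] for i in range(4)])
        s = [s[i] + dt * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6
             for i in range(4)]
    return s

def rotate(v, t):
    """Apply the rotation matrix of angle t to a vector in R^2."""
    c, s = math.cos(t), math.sin(t)
    return [c * v[0] - s * v[1], s * v[0] + c * v[1]]

x0, p0, t = [1.0, 2.0], [-0.5, 0.3], 0.9
numeric = j_flow(x0, p0, t)
exact = rotate(x0, t) + rotate(p0, t)
print(max(abs(a - b) for a, b in zip(numeric, exact)))  # small
```

Here the flow parameter `t` plays the role of a rotation angle, not of physical time.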

Note that since the Hamiltonian flow generated by $J$ does not have the
interpretation of the time-evolution of the particle, the parameter $t$ in (2.33)
should not be interpreted as the physical time; it is just the parameter in a
one-parameter group of diffeomorphisms. In this case, $t$ is the angle of rotation.
Thus, one answer to the question, “What is the angular momentum?”
is that $J$ is the Hamiltonian generator of rotations.

If $f$ is any smooth function, then by the proof of Proposition 2.25, the
time derivative of any other function $g$ along the Hamiltonian flow generated
by $f$ is given by $dg/dt = \{g, f\}$. In particular, the derivative of the
Hamiltonian $H$ along the flow generated by $f$ is $\{H, f\}$. Thus, $f$ is constant
along the flow generated by $H$ if and only if $\{f, H\} = 0$, which holds if and
only if $\{H, f\} = 0$, which holds if and only if $H$ is constant along the flow
generated by $f$. This line of reasoning leads to the following result.

Conclusion 2.31 A function $f$ is a conserved quantity for solutions of
Hamilton’s equation (2.25) if and only if $H$ is invariant under the Hamiltonian
flow generated by $f$. In particular, the angular momentum $J$ is conserved
if and only if $H$ is invariant under simultaneous rotations of $\mathbf{x}$ and $\mathbf{p}$.

We will return to this way of thinking about conserved quantities in 
Chap. 21. Compare Exercise 12. 

The Hamiltonian framework can be extended in a straightforward way 
to systems of particles. 

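Proposition 2.32 Consider the phase space for a system of $N$ particles
moving in $\mathbb{R}^n$, namely $\mathbb{R}^{2nN}$, thought of as the set of $(2N)$-tuples of the
form

\[ (\mathbf{x}^1, \ldots, \mathbf{x}^N, \mathbf{p}^1, \ldots, \mathbf{p}^N) \]

with $\mathbf{x}^j$ and $\mathbf{p}^j$ belonging to $\mathbb{R}^n$. Define the Poisson bracket of two smooth
functions $f$ and $g$ on the phase space by

\[ \{f, g\} = \sum_{j=1}^N \sum_{k=1}^n \left( \frac{\partial f}{\partial x^j_k} \frac{\partial g}{\partial p^j_k} - \frac{\partial f}{\partial p^j_k} \frac{\partial g}{\partial x^j_k} \right) \]

and consider a Hamiltonian function of the form

\[ H(\mathbf{x}^1, \ldots, \mathbf{x}^N, \mathbf{p}^1, \ldots, \mathbf{p}^N) = \sum_{j=1}^N \frac{1}{2m_j} |\mathbf{p}^j|^2 + V(\mathbf{x}^1, \ldots, \mathbf{x}^N). \]

Then Newton’s law in the form $m_j \ddot{\mathbf{x}}^j = -\nabla^j V$ is equivalent to Hamilton’s
equations in the form

\[
\begin{aligned}
\frac{dx^j_k}{dt} &= \frac{\partial H}{\partial p^j_k} \\
\frac{dp^j_k}{dt} &= -\frac{\partial H}{\partial x^j_k}.
\end{aligned}
\tag{2.34}
\]

For any smooth function $f$, the derivative of $f$ along a solution of Hamilton’s
equations is given by

\[ \frac{df}{dt} = \{f, H\}. \]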

The proof of these results is entirely similar to the one-particle case and
is omitted.


2.6 The Kepler Problem and the Runge-Lenz Vector

2.6.1 The Kepler Problem 

We consider now the classical Kepler problem, that of finding the
trajectories of a planet orbiting the sun. Since the sun is very much more
massive than any of the planets, we may consider the position of the sun
to be fixed at the origin of our coordinate system. The sun exerts a force
on a planet given by

\[ \mathbf{F} = -k \frac{\mathbf{x}}{|\mathbf{x}|^3}. \tag{2.35} \]

Here $k = GmM$, where $m$ is the mass of the planet, $M$ is the mass of the
sun, and $G$ is the universal gravitational constant. Note that the magnitude
of $\mathbf{F}$ is proportional to the reciprocal of the square of the distance from the
origin; thus, the force follows an inverse square law. Since $k$ contains a
factor of the mass $m$ of the planet, this quantity drops out of the equation
of motion, $m\ddot{\mathbf{x}} = \mathbf{F}$. The potential associated with the force (2.35) is easily
seen to be

\[ V(\mathbf{x}) = -\frac{k}{|\mathbf{x}|}. \tag{2.36} \]

Since our potential $V$ is invariant under rotations, the angular momentum
vector $\mathbf{J} = \mathbf{x} \times \mathbf{p}$ is a conserved quantity (Theorem 2.21 with $N = 1$ and
$n = 3$). If $\mathbf{J} = 0$, the particle is moving along a ray through the origin.
In that case, either the particle will pass through the origin at some point
in the future (if the initial momentum points toward the origin), or else
the particle must have passed through the origin at some point in the past
(if the initial momentum points away from the origin). Trajectories of this
sort are called collision trajectories, and we will regard such trajectories as
pathological.

We will, from now on, consider only trajectories along which the angular
momentum vector is nonzero. Fixing the energy and angular momentum of
the particle guarantees that the particle stays a certain minimum distance
from the origin (Exercise 20). Meanwhile, since $\mathbf{J} = \mathbf{x} \times \mathbf{p}$, the position
$\mathbf{x}(t)$ of the particle will always be perpendicular to the constant value of $\mathbf{J}$.
We will therefore refer to the plane (through the origin) perpendicular to
$\mathbf{J}$ as the “plane of motion.”


2.6.2 Conservation of the Runge-Lenz Vector 

We are going to obtain a description of the classical trajectories in an
indirect way, using something called the Runge-Lenz vector.

Definition 2.33 The Runge-Lenz vector is the vector-valued function
on $(\mathbb{R}^3 \setminus \{0\}) \times \mathbb{R}^3$ given by

\[ \mathbf{A}(\mathbf{x}, \mathbf{p}) = \frac{1}{mk} \mathbf{p} \times \mathbf{J} - \frac{\mathbf{x}}{|\mathbf{x}|}. \]

Here $\mathbf{x}$ represents the position of a classical particle and $\mathbf{p}$ its momentum.


The significance of this vector is that it is a conserved quantity for the
Kepler problem. Of course, whenever the potential energy is radial (a function
of the distance from the origin), the angular momentum vector is a
conserved quantity. What is special about the $1/r$ potential of the Kepler
problem is that there is another conserved vector-valued quantity.

Proposition 2.34 The Runge-Lenz vector is a conserved quantity for Newton’s
law with force given by (2.35).

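Proof. Since J is conserved, we compute that

\[
\begin{aligned}
\frac{d}{dt}\mathbf{A}(t)
&= \frac{1}{mk}\,\mathbf{F}\times\mathbf{J} - \frac{1}{|\mathbf{x}|}\frac{d\mathbf{x}}{dt} + \frac{\mathbf{x}}{|\mathbf{x}|^{2}}\frac{d|\mathbf{x}|}{dt} \\
&= -\frac{1}{m|\mathbf{x}|^{3}}\,\mathbf{x}\times(\mathbf{x}\times\mathbf{p}) - \frac{1}{|\mathbf{x}|}\frac{\mathbf{p}}{m} + \frac{\mathbf{x}}{|\mathbf{x}|^{2}}\,\frac{\mathbf{x}\cdot\mathbf{p}}{m|\mathbf{x}|} \\
&= -\frac{1}{m|\mathbf{x}|^{3}}\,\mathbf{x}(\mathbf{x}\cdot\mathbf{p}) + \frac{1}{m|\mathbf{x}|^{3}}\,\mathbf{p}(\mathbf{x}\cdot\mathbf{x}) - \frac{1}{m|\mathbf{x}|}\,\mathbf{p} + \frac{1}{m|\mathbf{x}|^{3}}\,\mathbf{x}(\mathbf{x}\cdot\mathbf{p}) \\
&= 0.
\end{aligned}
\]

Here we have used the identity b × (c × d) = c(b · d) − d(b · c), which holds
for all vectors $\mathbf{b}, \mathbf{c}, \mathbf{d} \in \mathbb{R}^3$. ■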
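
As a numerical sanity check on Proposition 2.34, the following minimal sketch integrates Newton's law for the force $-k\mathbf{x}/|\mathbf{x}|^3$ and tracks how far the Runge-Lenz vector of Definition 2.33 drifts from its initial value. The function names, the fourth-order Runge-Kutta integrator, the step size, and the initial data are all our own illustrative choices, not part of the text.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def norm(a):
    return math.sqrt(sum(c*c for c in a))

def runge_lenz(x, p, m, k):
    """A(x, p) = (1/(m k)) p x J - x/|x|, with J = x x p."""
    pJ = cross(p, cross(x, p))
    r = norm(x)
    return tuple(pJ[i]/(m*k) - x[i]/r for i in range(3))

def deriv(x, p, m, k):
    """Right-hand side of Newton's law: dx/dt = p/m, dp/dt = -k x/|x|^3."""
    r = norm(x)
    return tuple(c/m for c in p), tuple(-k*c/r**3 for c in x)

def shift(a, b, s):
    return tuple(ai + s*bi for ai, bi in zip(a, b))

def rk4_step(x, p, m, k, dt):
    """One classical fourth-order Runge-Kutta step (an assumed integrator)."""
    k1x, k1p = deriv(x, p, m, k)
    k2x, k2p = deriv(shift(x, k1x, dt/2), shift(p, k1p, dt/2), m, k)
    k3x, k3p = deriv(shift(x, k2x, dt/2), shift(p, k2p, dt/2), m, k)
    k4x, k4p = deriv(shift(x, k3x, dt), shift(p, k3p, dt), m, k)
    x_new = tuple(x[i] + dt/6*(k1x[i] + 2*k2x[i] + 2*k3x[i] + k4x[i]) for i in range(3))
    p_new = tuple(p[i] + dt/6*(k1p[i] + 2*k2p[i] + 2*k3p[i] + k4p[i]) for i in range(3))
    return x_new, p_new

def max_drift(x0, p0, m=1.0, k=1.0, dt=1e-3, steps=5000):
    """Largest |A(t) - A(0)| observed along the numerical trajectory."""
    A0 = runge_lenz(x0, p0, m, k)
    x, p = x0, p0
    worst = 0.0
    for _ in range(steps):
        x, p = rk4_step(x, p, m, k, dt)
        A = runge_lenz(x, p, m, k)
        worst = max(worst, norm(tuple(A[i] - A0[i] for i in range(3))))
    return worst

# Hypothetical initial data with nonzero angular momentum and negative energy.
print(max_drift((1.0, 0.0, 0.0), (0.0, 1.2, 0.3)))
```

The printed drift should be at the level of the integrator's error, which is consistent with (but of course does not prove) the conservation of A.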


2.6.3 Ellipses, Hyperbolas, and Parabolas 


We now use the Runge-Lenz vector to determine the trajectories for the
Kepler problem.

Proposition 2.35 The magnitude of the Runge-Lenz vector A satisfies

\[
|\mathbf{A}|^{2} = 1 + \frac{2|\mathbf{J}|^{2}}{mk^{2}}\,E,
\]

where $E = |\mathbf{p}|^{2}/(2m) - k/|\mathbf{x}|$ is the energy of the particle. Furthermore,
if $\hat{\mathbf{x}} := \mathbf{x}/|\mathbf{x}|$ is the unit vector in the x-direction, we have

\[
\mathbf{A}\cdot\hat{\mathbf{x}} = \frac{|\mathbf{J}|^{2}}{mk\,|\mathbf{x}|} - 1 \tag{2.37}
\]

for all nonzero x. It follows from (2.37) that

\[
|\mathbf{x}| = \frac{|\mathbf{J}|^{2}}{mk\,(1 + \mathbf{A}\cdot\hat{\mathbf{x}})}.
\]

Note that from (2.37), $\mathbf{A}\cdot\hat{\mathbf{x}} > -1$ for all points (x, p) with x ≠ 0
(recall that we are assuming J ≠ 0).
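Proof. Using the identity b · (c × d) = d · (b × c), we see that

\[
\hat{\mathbf{x}}\cdot(\mathbf{p}\times\mathbf{J}) = \mathbf{J}\cdot(\hat{\mathbf{x}}\times\mathbf{p}) = |\mathbf{J}|^{2}/|\mathbf{x}|.
\]

Since J and p are orthogonal, we get

\[
\begin{aligned}
|\mathbf{A}|^{2} &= \frac{|\mathbf{p}|^{2}|\mathbf{J}|^{2}}{m^{2}k^{2}} + 1 - \frac{2}{mk}\,\hat{\mathbf{x}}\cdot(\mathbf{p}\times\mathbf{J}) \\
&= 1 + \frac{2|\mathbf{J}|^{2}}{mk^{2}}\left(\frac{|\mathbf{p}|^{2}}{2m} - \frac{k}{|\mathbf{x}|}\right) \\
&= 1 + \frac{2|\mathbf{J}|^{2}}{mk^{2}}\,E.
\end{aligned}
\]

Using again the identity for b · (c × d), we next compute that

\[
\mathbf{A}\cdot\mathbf{x} = \frac{1}{mk}\,\mathbf{J}\cdot(\mathbf{x}\times\mathbf{p}) - \frac{\mathbf{x}\cdot\mathbf{x}}{|\mathbf{x}|} = \frac{|\mathbf{J}|^{2}}{mk} - |\mathbf{x}|.
\]

We may now divide by |x| to obtain the desired expression for $\mathbf{A}\cdot\hat{\mathbf{x}}$. It is
then straightforward to solve for |x|. ■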
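
The two identities of Proposition 2.35 are purely algebraic, so they can be spot-checked at an arbitrary point of phase space. The sketch below does this in plain Python; the helper names and the sample values of x, p, m, and k are hypothetical choices made only for illustration.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(ai*bi for ai, bi in zip(a, b))

def prop_2_35_residuals(x, p, m, k):
    """Return (|A|^2 - [1 + 2|J|^2 E/(m k^2)],  A.x_hat - [|J|^2/(m k |x|) - 1])."""
    J = cross(x, p)
    r = math.sqrt(dot(x, x))
    pJ = cross(p, J)
    A = tuple(pJ[i]/(m*k) - x[i]/r for i in range(3))   # Runge-Lenz vector
    E = dot(p, p)/(2*m) - k/r                           # energy
    J2 = dot(J, J)
    x_hat = tuple(c/r for c in x)
    res_magnitude = dot(A, A) - (1 + 2*J2*E/(m*k*k))
    res_projection = dot(A, x_hat) - (J2/(m*k*r) - 1)
    return res_magnitude, res_projection

# Both residuals should vanish up to floating-point rounding.
print(prop_2_35_residuals((1.0, 2.0, -0.5), (0.3, -0.7, 1.1), m=2.0, k=3.0))
```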


Corollary 2.36 Choose orthonormal coordinates in the plane of motion
so that A lies along the positive $x_1$-axis. If r and $\theta$ are the polar
coordinates associated with this coordinate system, then along each trajectory
$(r(t), \theta(t))$, we have

\[
r(t) = \frac{|\mathbf{J}|^{2}}{mk}\,\frac{1}{1 + A\cos\theta(t)}, \tag{2.38}
\]

where $A = |\mathbf{A}|$.


If A = 0, any orthonormal coordinates can be used.


Proposition 2.37 If $A := |\mathbf{A}| < 1$, (2.38) is the equation of an ellipse with
eccentricity A and with the origin being one focus of the ellipse. If A > 1,
(2.38) is the equation of a hyperbola, and if A = 1, (2.38) is the equation
of a parabola.

The orbit of the particle in the plane of motion is an ellipse if the energy
of the particle is negative, a hyperbola if the energy is positive, and a
parabola if the energy is zero.

FIGURE 2.2. Elliptical orbit for the Kepler problem, with two equal areas shaded.

Kepler’s first law is the assertion that planets move in elliptical
trajectories with the sun at one focus, as shown in Fig. 2.2. The shaded regions
indicate two equal areas that are swept out in equal times, in accordance
with Kepler’s second law (Corollary 2.19).

Recall that the eccentricity of an ellipse is $\sqrt{1 - (b/a)^{2}}$, where a is half
the length of the major axis and b is half the length of the minor axis.
Thus, when A = 0, we have b = a, meaning that the ellipse is a circle.
Proof. We continue to work in a coordinate system in which A is along
the positive $x_1$-axis. Then (2.38) becomes

\[
\sqrt{x^{2}+y^{2}} = \frac{\alpha}{1 + A\,x/\sqrt{x^{2}+y^{2}}},
\]

where $\alpha = |\mathbf{J}|^{2}/(mk)$. From this we obtain

\[
\sqrt{x^{2}+y^{2}} + Ax = \alpha.
\]

Now we can solve for $\sqrt{x^{2}+y^{2}}$, square both sides of the equation, and
simplify. Assuming $A^{2} \neq 1$, we obtain

\[
\frac{\alpha^{2}}{(1-A^{2})^{2}} = \left(x + \frac{A\alpha}{1-A^{2}}\right)^{2} + \frac{y^{2}}{1-A^{2}}. \tag{2.39}
\]

This is the equation of an ellipse (if $A^{2} < 1$) or a hyperbola (if $A^{2} > 1$),
where the center of the ellipse or hyperbola is the point $(-A\alpha/(1-A^{2}), 0)$.
In light of the formula for $A := |\mathbf{A}|$ in Proposition 2.35, we obtain an ellipse
if the energy of the particle is negative and a hyperbola if the energy is
positive.

In the case $A^{2} < 1$, we may readily compute the half-lengths a and b of
the major and minor axes as

\[
a = \frac{\alpha}{1-A^{2}}, \qquad b = \frac{\alpha}{\sqrt{1-A^{2}}}.
\]

From this, we readily calculate that the eccentricity is A. Now, the distance
between the foci of an ellipse is the length of the major axis times the
eccentricity, in our case, $2A\alpha/(1-A^{2})$. Since the center of the ellipse in
(2.39) is at the point $(-A\alpha/(1-A^{2}), 0)$, the origin is one focus of the ellipse.

If $A^{2} = 1$, then when we perform the same analysis, $x^{2}$ drops out of the
equation and we obtain

\[
x = \frac{1}{2A\alpha}\left(-y^{2} + \alpha^{2}\right),
\]

which is the equation of a parabola opening along the x-axis. This case
corresponds to energy zero. ■
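
The passage from the polar equation (2.38) to the Cartesian equation (2.39) can be checked numerically: points generated from (2.38) should satisfy (2.39), and in the elliptical case the half-axes should give eccentricity A with a focus at the origin. The following is a minimal sketch under those assumptions; the function names and the sample values of $\alpha$ and A are ours.

```python
import math

def conic_residual(alpha, A, theta):
    """Left side minus right side of (2.39) at the point of (2.38) with angle theta.

    Requires A**2 != 1 and 1 + A*cos(theta) > 0.
    """
    r = alpha / (1 + A*math.cos(theta))
    x, y = r*math.cos(theta), r*math.sin(theta)
    s = 1 - A*A
    return alpha**2/s**2 - ((x + A*alpha/s)**2 + y**2/s)

def ellipse_data(alpha, A):
    """Half-axes, eccentricity, and x-coordinate of the center, for A**2 < 1."""
    s = 1 - A*A
    a = alpha/s                    # half the major axis
    b = alpha/math.sqrt(s)         # half the minor axis
    ecc = math.sqrt(1 - (b/a)**2)  # should reproduce A
    center_x = -A*alpha/s
    return a, b, ecc, center_x

a, b, ecc, cx = ellipse_data(1.5, 0.4)
# The focus at distance ecc*a to the right of the center should sit at the origin.
print(ecc, cx + ecc*a)
```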

Note that Proposition 2.37 does not tell us how the particle moves along
the ellipse, hyperbola, or parabola as a function of time. We can, however,
determine this, at least in principle, by making use of the angular
momentum. After all, applying (2.17) in the plane of motion gives

\[
\frac{d\theta}{dt} = \frac{1}{mr^{2}}\,|\mathbf{J}|, \tag{2.40}
\]

where $\theta$ is the polar angle variable in the plane of motion. Since we have
computed r as a function of $\theta$ in Corollary 2.36, (2.40) gives us a
(first-order, separable) differential equation, which we can attempt to solve
to obtain $\theta$, and thus also r, as a function of t.
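
Because (2.40) is separable, the time for one revolution of an elliptical orbit is the integral of $m\,r(\theta)^2/|\mathbf{J}|$ over $0 \le \theta \le 2\pi$, with $r(\theta)$ taken from (2.38). The sketch below evaluates that integral by the midpoint rule and compares it with the standard closed form $2\pi\sqrt{m a^3/k}$, where a is the half-length of the major axis; that closed form is quoted here as a known fact about the Kepler problem rather than derived in the text, and the function names and sample parameters are our own.

```python
import math

def orbit_period(J, m, k, A, n=20000):
    """Integrate dt = m r(theta)^2 / |J| d(theta) over one revolution (needs A < 1)."""
    alpha = J*J/(m*k)
    h = 2*math.pi/n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5)*h                    # midpoint of the i-th subinterval
        r = alpha/(1 + A*math.cos(theta))      # equation (2.38)
        total += m*r*r/J*h
    return total

def closed_form_period(J, m, k, A):
    """2*pi*sqrt(m a^3 / k) with a = alpha/(1 - A^2)."""
    a = (J*J/(m*k))/(1 - A*A)
    return 2*math.pi*math.sqrt(m*a**3/k)

print(orbit_period(1.3, 2.0, 3.0, 0.5), closed_form_period(1.3, 2.0, 3.0, 0.5))
```

The two numbers should agree closely, since the midpoint rule is very accurate for smooth periodic integrands.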


2.6.4 Special Properties of the Kepler Problem 

As we have said, the existence of another conserved vector-valued function— 
in addition to the conserved energy and angular momentum—is special to 
a potential of the form —k/ |x| . For a general radial potential, the energy 
and the angular momentum will be the only conserved quantities. Assuming 
J ≠ 0, the motion of a particle in any radial potential will always lie in the
plane perpendicular to J. Taking this into account, we think of our particle
as moving in $\mathbb{R}^2$ rather than $\mathbb{R}^3$, and accordingly think of our phase space
as being four-dimensional rather than six-dimensional. From this point of
view, there are two remaining conserved quantities, the energy E and the
scalar angular momentum J in the plane, as given by Definition 2.17. Thus,
each trajectory will lie in a set of the form

\[
\{ (\mathbf{x},\mathbf{p}) \in \mathbb{R}^2 \times \mathbb{R}^2 \mid E(\mathbf{x},\mathbf{p}) = a,\ J(\mathbf{x},\mathbf{p}) = b \}.
\]

We refer to such a set as a joint level set of E and J. These sets are
two-dimensional surfaces inside our four-dimensional phase space.

FIGURE 2.3. Trajectory in the plane of motion for a typical radial potential.

For a general radial potential, a trajectory (x(t), p(t)) in phase space
may not be a closed curve, but may fill up a dense subset of the joint
level surface on which it lives. In particular, the trajectory x(t) in position
space will typically not be a closed curve. For example, x(t) may trace out
a roughly elliptical region in the plane, but where the axes of the ellipse
“precess,” that is, vary with time. Such a trajectory is shown in Fig. 2.3,
which should be contrasted with Fig. 2.2.

In the Kepler problem, even after restricting attention to the plane of
motion, we still have one conserved quantity in addition to E and J, namely
the direction of A, which can be expressed in terms of the angle $\phi$ between
A and the $x_1$-axis in the plane of motion. (Note that both terms in the
definition of A lie in the plane of motion. Note also that the magnitude of A
is, by Proposition 2.35, computable in terms of E and J.) The trajectories
of the Kepler problem, then, lie in the joint level sets of E, J, and $\phi$,
which are one-dimensional. When E < 0, the joint level sets of E and J are
compact, in which case the joint level sets of E, J, and $\phi$ are compact
and one-dimensional, that is, simple closed curves.

Another special property of the Kepler problem is that the period of the 
closed trajectories (the trajectories with negative energy) is the same for all 
trajectories with the same energy (Exercise 21). This apparent coincidence 
can be explained by showing that the Hamiltonian flows (Definition 2.28) 
generated by J and A act transitively on the energy surfaces. These flows 
commute with the time evolution of the system, because the functions that
generate them are all conserved quantities (Conclusion 2.31). Thus, any two points with the same
energy are “equivalent” with respect to time evolution. Although we will 
not go into the details of this analysis, we will gain a better understanding 
of the flows generated by the components of A in Sect. 18.4. 


2.7 Exercises 

1. Consider a particle moving in the real line in the presence of a force
coming from a potential function V. Given some value $E_0$ for the
energy of the particle, suppose that $V(x) < E_0$ for all x in some
closed interval $[x_0, x_1]$. Then a particle with initial position $x_0$ and
positive initial velocity will continue to move to the right until it
reaches $x_1$. Using (2.6), show that the time needed to travel from $x_0$
to $x_1$ is given by

\[
t = \int_{x_0}^{x_1} \sqrt{\frac{m}{2\,(E_0 - V(y))}}\; dy.
\]

Note: This shows that we can solve Newton’s equation in $\mathbb{R}^1$ more
or less explicitly for time as a function of position, which in principle
determines the position as a function of time.

2. In the notation of the previous problem, suppose now that $V(x) < E_0$
for $x_0 < x < x_1$, but that $V(x_1) = E_0$.

(a) Show that if $V'(x_1) \neq 0$, then the particle reaches $x_1$ in a finite
time.

(b) Show that if $V'(x_1) = 0$, then the time it takes the particle to
reach $x_1$ is infinite; that is, the particle approaches but never
actually reaches $x_1$.


Note: In Part (b), the point $x_1$ is an unstable equilibrium for the
system, that is, a critical point for V that is not a local minimum.


3. Consider the equation of motion of a pendulum of length L,

\[
\frac{d^{2}\theta}{dt^{2}} = -\frac{g}{L}\,\sin\theta,
\]

where g is the acceleration of gravity. Here $\theta$ is the angle between the
pendulum and the negative y-axis in the plane. This system has a
stable equilibrium at $\theta = 0$ and an unstable equilibrium at $\theta = \pi$.

Consider initial conditions of the form $\theta(0) = \pi - \delta$, $\dot{\theta}(0) = 0$, for
$0 < \delta < \pi/4$. Fix some angle $\theta_0$ and let $T(\delta)$ denote the time it takes
for the pendulum with the given initial conditions to reach the angle
$\theta_0$. (Here $\theta_0$ represents an arbitrarily chosen cutoff point at which the
pendulum is no longer “close” to $\theta = \pi$.) Show that $T(\delta)$ grows only
logarithmically as $\delta$ tends to zero.

Note: Logarithmic growth of T as a function of $\delta$ corresponds to
exponential decay of $\delta$ as a function of T. Thus, if we want T to be
large, we must choose $\delta$ to be very small.


4. Consider a particle moving in the real line in the presence of a
“repelling potential,” such that there is an A with $V'(x) < 0$ for
all x > A. Then a particle with initial position $x_0 > A$ and positive
initial velocity will have positive velocity for all positive times.
Suppose now that $V(x) = -x^{a}$ for all x > 1, for some positive constant
a. Suppose also that the particle is given initial position $x_0 > 1$ and
positive initial velocity. Show that for a > 2, the particle escapes to
infinity in finite time, but that for a < 2, the position of the particle
remains finite for all finite times.

Hint: Use Problem 1.


5. Consider the equation $m\ddot{x} + \gamma\dot{x} + kx = 0$, where $\gamma$ and k are positive
constants (the damping constant and spring constant, respectively).
Find the critical value $\gamma_c$ of $\gamma$ (for a fixed m and k) such that for
$\gamma < \gamma_c$ we get solutions that are sines and cosines times a decaying
exponential and for $\gamma > \gamma_c$, we get pure decaying exponentials.

6. Continue with the notation of Exercise 5. Given particular choices
for m, $\gamma$, and k, let r be the rate of exponential decay of a “generic”
solution to the equation of motion. Here, if the solution is of the form
$a e^{-rt}\cos(\omega t) + b e^{-rt}\sin(\omega t)$, the rate of exponential decay is r. If the
solution is of the form $a e^{-r_1 t} + b e^{-r_2 t}$, then $r = \min(r_1, r_2)$, since
the slower-decaying term will dominate as long as a and b are both
nonzero.

For a fixed value of m and k, show that the maximum value for r
is achieved by taking $\gamma = \gamma_c$. (This accounts for the terminology
“critical damping” for the case in which $\gamma = \gamma_c$.)

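7. Consider the $\mathbb{R}^2$-valued function F on $\mathbb{R}^2 \setminus \{0\}$ given by

\[
\mathbf{F}(x_1, x_2) = \left( -\frac{x_2}{x_1^{2}+x_2^{2}},\ \frac{x_1}{x_1^{2}+x_2^{2}} \right).
\]

Show that $\partial F_1/\partial x_2 - \partial F_2/\partial x_1 = 0$ but that there does not exist any
smooth function V on $\mathbb{R}^2 \setminus \{0\}$ with $\mathbf{F} = -\nabla V$.

Hint: If F were of the form $-\nabla V$, we would have

\[
V(\mathbf{x}(b)) - V(\mathbf{x}(a)) = -\int_a^b \mathbf{F}(\mathbf{x}(t)) \cdot \frac{d\mathbf{x}}{dt}\, dt
\]

for every smooth path $\mathbf{x}(\cdot) : [a, b] \to \mathbb{R}^2 \setminus \{0\}$, by the fundamental
theorem of calculus and the chain rule.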


8. Consider a particle moving in $\mathbb{R}^n$ with a velocity-dependent force law
given by

\[
\mathbf{F}(\mathbf{x},\mathbf{v}) = -\nabla V(\mathbf{x}) + \mathbf{F}_2(\mathbf{x},\mathbf{v}),
\]

where the velocity-dependent term $\mathbf{F}_2$ acts perpendicularly to the
velocity of the particle. (That is, we assume that $\mathbf{v}\cdot\mathbf{F}_2(\mathbf{x},\mathbf{v}) = 0$
for all x and v.) Let E denote the usual energy function
$E(\mathbf{x},\mathbf{v}) = \tfrac{1}{2} m |\mathbf{v}|^{2} + V(\mathbf{x})$, unmodified by the presence of the
velocity-dependent term in the force. Show that E is conserved.

9. (a) If r and $\theta$ are the usual polar coordinates on $\mathbb{R}^2$, compute $\partial\theta/\partial x_1$
and $\partial\theta/\partial x_2$.

(b) If $\mathbf{x}(\cdot)$ denotes the trajectory of a particle of mass m moving in
$\mathbb{R}^2$, show that

\[
\frac{d}{dt}\,\theta(\mathbf{x}(t)) = \frac{1}{mr^{2}}\,J(\mathbf{x}(t),\mathbf{p}(t)).
\]

10. Prove Theorem 2.21, by imitating the proof of Proposition 2.18. You 
may assume that every rotation can be built up as a product of 
repeated rotations in the various coordinate planes (i.e., rotations in 
the $(x_j, x_k)$ plane, for various pairs (j, k), where the same plane may
be used more than once). 

11. Consider Hamilton’s equations for N particles moving in $\mathbb{R}^n$, as in
Proposition 2.32. Show that the total momentum $\mathbf{p} = \sum_{j=1}^{N} \mathbf{p}^j$ of the
system is a conserved quantity if and only if the quantity

\[
H(\mathbf{x}^1 + \mathbf{a}, \ldots, \mathbf{x}^N + \mathbf{a}, \mathbf{p}^1, \ldots, \mathbf{p}^N), \quad \mathbf{a} \in \mathbb{R}^n,
\]

is independent of a for all $\mathbf{x}^1, \ldots, \mathbf{x}^N$ and $\mathbf{p}^1, \ldots, \mathbf{p}^N$ in $\mathbb{R}^n$.

Hint: Use (the N-particle version of) Conclusion 2.31.

12. Let J denote the angular momentum of a particle moving in $\mathbb{R}^2$.
Let $R_\theta$ denote a counterclockwise rotation by angle $\theta$ in $\mathbb{R}^2$.

(a) If f is any smooth function on $\mathbb{R}^4$, show that

\[
\{f, J\}(\mathbf{x},\mathbf{p}) = \left.\frac{d}{d\theta}\, f(R_\theta \mathbf{x}, R_\theta \mathbf{p})\right|_{\theta=0}.
\]

(b) Let H be any smooth function on $\mathbb{R}^4$ and consider Hamilton’s
equations with this function playing the role of the Hamiltonian.
Show that J is conserved (i.e., constant in time along any
solution of Hamilton’s equations) if and only if

\[
H(R_\theta \mathbf{x}, R_\theta \mathbf{p}) = H(\mathbf{x},\mathbf{p})
\]

for all $\theta$ in $\mathbb{R}$ and all x and p in $\mathbb{R}^2$. (This argument is a more
explicit way to obtain Conclusion 2.31.)

13. Suppose that f and g are smooth functions on $\mathbb{R}^{2n}$ and that at least
one of the two functions has compact support. Show that

\[
\int_{\mathbb{R}^n}\int_{\mathbb{R}^n} \{f, g\}(\mathbf{x},\mathbf{p})\; d^n x\; d^n p = 0.
\]

Hint: Use integration by parts or Liouville’s theorem.

14. Let X and Y be “vector fields” on $\mathbb{R}^n$, viewed as first-order differential
operators. This means that X and Y are of the form

\[
X = \sum_{j=1}^{n} a_j(\mathbf{x})\,\frac{\partial}{\partial x_j}; \qquad Y = \sum_{j=1}^{n} b_j(\mathbf{x})\,\frac{\partial}{\partial x_j}.
\]

[If $\mathbf{X}(\mathbf{x}) = (a_1(\mathbf{x}), \ldots, a_n(\mathbf{x}))$, then the operator X is the directional
derivative in the direction of $\mathbf{X}$. It is common to identify the
vector-valued function $\mathbf{X}$ with the associated first-order differential operator
X.]

Show that the commutator [X, Y] of X and Y, defined by

\[
[X, Y] = XY - YX,
\]

is again a vector field (i.e., a first-order differential operator).

15. Given a smooth function f on $\mathbb{R}^{2n}$, define an operator $X_f$, acting on
$C^\infty(\mathbb{R}^{2n})$, by the formula

\[
X_f(g) = \{f, g\}.
\]

That is to say,

\[
X_f = \sum_{j=1}^{n} \left( \frac{\partial f}{\partial x_j}\frac{\partial}{\partial p_j} - \frac{\partial f}{\partial p_j}\frac{\partial}{\partial x_j} \right).
\]

The operator $X_f$ is called the Hamiltonian vector field associated
with the function f. (Here, as in Exercise 14, we identify vector fields
with first-order differential operators.)

(a) Show that for all $f, g \in C^\infty(\mathbb{R}^{2n})$, we have

\[
X_{\{f,g\}} = [X_f, X_g],
\]

where $[X_f, X_g] = X_f X_g - X_g X_f$.

Hint: By Exercise 14, all terms in the computation of $[X_f, X_g](h)$
involving second derivatives of h can be neglected, since they will
always cancel out to zero.

(b) Use Part (a) to compute $\{\{f, g\}, h\} = X_{\{f,g\}}(h)$ and thereby
obtain another proof of the Jacobi identity for the Poisson bracket.


16. Recall the definition of a Hamiltonian vector field $X_f$ in Exercise 15.

(a) Consider a smooth vector field X on $\mathbb{R}^2$ (viewed as a first-order
differential operator as in Exercise 14) of the form

\[
X = g_1(x,p)\,\frac{\partial}{\partial x} + g_2(x,p)\,\frac{\partial}{\partial p}.
\]

Show that X can be expressed as $X = X_f$, for some $f \in C^\infty(\mathbb{R}^2)$,
if and only if X is divergence free, that is, if and only if

\[
\nabla\cdot X := \frac{\partial g_1}{\partial x} + \frac{\partial g_2}{\partial p} = 0.
\]

Hint: As in Proposition 2.7, given a pair of functions $h_1$ and $h_2$
on $\mathbb{R}^2$, there exists a function f with $\partial f/\partial x = h_1$ and
$\partial f/\partial p = h_2$ if and only if we have $\partial h_1/\partial p = \partial h_2/\partial x$.

(b) Show that there exists a smooth vector field X on $\mathbb{R}^4$ of the form

\[
X = \sum_{j=1}^{2} \left( g_j(\mathbf{x},\mathbf{p})\,\frac{\partial}{\partial x_j} + g_{j+2}(\mathbf{x},\mathbf{p})\,\frac{\partial}{\partial p_j} \right)
\]

such that

\[
\sum_{j=1}^{2} \left( \frac{\partial g_j}{\partial x_j} + \frac{\partial g_{j+2}}{\partial p_j} \right) = 0
\]

but such that there does not exist $f \in C^\infty(\mathbb{R}^4)$ with $X = X_f$.
Hint: You should be able to find a counterexample in which the
coefficient functions $g_j$ are linear.


17. Show that the space of homogeneous polynomials of degree 2 on $\mathbb{R}^{2n}$
is closed under the Poisson bracket.


18. Determine the Hamiltonian flow on $\mathbb{R}^2$ generated by the function
$f(x,p) = xp$.

19. Let J denote the angular momentum vector for a particle moving in
$\mathbb{R}^3$, namely $\mathbf{J} = \mathbf{x} \times \mathbf{p}$. Show that the components $J_1$, $J_2$, and $J_3$ of
J satisfy the following Poisson bracket relations:

\[
\{J_1, J_2\} = J_3; \qquad \{J_2, J_3\} = J_1; \qquad \{J_3, J_1\} = J_2.
\]


20. In the Kepler problem, show that for each real number E and positive
number J, there exists $\varepsilon > 0$ such that for all (x, p) with E(x, p) = E
and |J(x, p)| = J, we have $|\mathbf{x}| > \varepsilon$.

Hint: Suppose that $(\mathbf{x}_n, \mathbf{p}_n)$ is a sequence with $|\mathbf{J}(\mathbf{x}_n, \mathbf{p}_n)| = J$ and
$|\mathbf{x}_n|$ tending to zero. Show that $E(\mathbf{x}_n, \mathbf{p}_n)$ tends to $+\infty$.

21. (a) Determine the area of the ellipse in the plane of motion in
Proposition 2.37, in the case A < 1.

(b) Show that the time T it takes the particle to travel once around
the ellipse is given by

\[
T = \frac{\pi}{\sqrt{2}}\, GM\, (-\tilde{E})^{-3/2},
\]

where $\tilde{E}$ is the “massless energy” of the particle, given by

\[
\tilde{E} = \frac{E}{m} = \frac{|\mathbf{p}|^{2}}{2m^{2}} - \frac{GM}{|\mathbf{x}|}.
\]

Note that in the case where the trajectory in the plane of motion is
elliptical, the energy of the particle is negative.

Note: The result of Part (b) is closely related to Kepler’s third law.



3 

A First Approach to Quantum 
Mechanics 


In this chapter, we try to understand the main ideas of quantum mechanics. 
In quantum mechanics, the outcome of a measurement cannot—even in 
principle—be predicted beforehand; only the probabilities for the outcome 
of the measurement can be predicted. These probabilities are encoded in a
wave function, which is a function of a position variable $x \in \mathbb{R}^n$. The square
of the absolute value of the wave function encodes the probabilities for the 
position of the particle. Meanwhile, the probabilities for the momentum of 
the particle are encoded in the frequency of oscillation of the wave function. 
The probabilities can be described using the position operator and the 
momentum operator. The time-evolution of the wave function is described 
by the Hamiltonian operator, which is analogous to the Hamiltonian (or 
energy) function in Hamilton’s equations. 


3.1 Waves, Particles, and Probabilities 

There are two key ingredients to quantum theory, both of which arose from 
experiments. The first ingredient is wave-particle duality, in which objects 
are observed to have both wavelike and particlelike behavior. Light, for 
example, was thought to be a wave throughout much of the nineteenth 
century, but was observed in the early twentieth century to have particle
behavior as well. Electrons, meanwhile, were originally thought to be
particles, but were then observed to have wave behavior.

The second ingredient of quantum theory is its probabilistic behavior.
In the two-slit experiment, for example, electrons that are “identically 
prepared” do not all hit the screen at the same point. Quantum theory 
postulates that this randomness is fundamental to the way nature behaves. 
According to quantum mechanics, it is impossible (theoretically, not just 
in practice) to predict ahead of time what the outcome of an experiment 
will be. The best that can be done is to predict the probabilities for the 
outcome of an experiment. 

These two aspects of quantum theory come together in the wave function.
The wave function is a function of a variable $x \in \mathbb{R}^n$, which we interpret as
describing the possible values of the position of a particle, and it evolves in
time according to a wavelike equation (the Schrödinger equation). The wave
function and its time-evolution account for the wave aspect of quantum
theory. The particle aspect of the theory comes from the interpretation of
the wave function. Although it is tempting to interpret the wave function
as a sort of cloud, where we have, say, a little bit of electron-cloud over
here, and a little bit of electron-cloud over there, this interpretation is not
consistent with experiment. Whenever we attempt to measure the position 
of a single electron, we always find the electron at a single point. A single 
electron in the two-slit experiment is observed at a single point on the 
screen, not spread out over the screen the way the wave function is. The 
wave function does not describe something that is directly observable for a 
single particle; rather, the wave function determines the statistical behavior 
of a whole sequence of identically prepared particles. See Fig. 1.4 for a 
dramatic experimental demonstration of this effect. 

In the two-slit experiment, for example, it is possible to determine how
the wave function behaves as a function of time by solving the (deterministic)
Schrödinger equation. Knowledge of the wave function of an individual
electron, however, does not determine where that electron will hit the
screen. The wave function merely tells us the probability distribution for 
where the electron might hit the screen, something that is only observable 
by shooting a whole sequence of electrons at the screen. 

It is an oversimplification, but a useful one, to describe the wave-particle 
aspect of quantum theory in this way: a single electron (or photon, or 
whatever) acts like a particle, but a large collection of electrons behaves 
like a wave. A single measurement of a single electron always gives its 
position as a point, just as we would expect for a particle. This point, 
however, varies from one electron to the next, even if we shoot each electron 
toward the screen in precisely the same way. Repeated measurements of 
identically prepared electrons give a distribution that can, for example, 
exhibit interference patterns, just as we would expect for a wave. See, again, 
Fig. 1.4, which should be compared to Figs. 1.1 and 1.2. 
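
The contrast between single hits and the accumulated pattern can be imitated with a small simulation. In the sketch below, each simulated electron lands at one random point drawn from an intensity profile, and the interference pattern appears only in the histogram of many hits. The profile $\cos^2(\pi x/d)$ is an idealized stand-in for $|\psi(x)|^2$ on the screen that we assume purely for illustration; it is not derived in the text, and the function names and parameters are hypothetical.

```python
import math
import random

def intensity(x, d=1.0):
    """Idealized two-slit pattern, proportional to |psi(x)|^2 (assumed form)."""
    return math.cos(math.pi*x/d)**2

def sample_hits(n, half_width=2.0, seed=0):
    """Draw n hit positions on [-half_width, half_width] by rejection sampling."""
    rng = random.Random(seed)
    hits = []
    while len(hits) < n:
        x = rng.uniform(-half_width, half_width)
        if rng.random() < intensity(x):   # accept with probability intensity(x) <= 1
            hits.append(x)
    return hits

def histogram(hits, half_width=2.0, bins=16):
    """Count hits in equal-width bins across the screen."""
    counts = [0]*bins
    width = 2*half_width/bins
    for x in hits:
        i = min(int((x + half_width)/width), bins - 1)
        counts[i] += 1
    return counts

hits = sample_hits(5000)
print(hits[:3])          # individual electrons: isolated points
print(histogram(hits))   # many electrons: alternating bright and dark bins
```

No single entry of `hits` looks wavelike; the fringes are a property of the whole collection, which is the sense in which the wave function describes statistics rather than an individual particle.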

It is interesting to note that at the macroscopic scale, where quantum
effects are not apparent, light appears to be a wave, whereas electrons appear
to be particles. This is the case even though both light and electrons are
really wave-particle hybrids, described in probabilistic terms by a wave
function. The difference between the two situations is that photons (the
particles of light) have mass zero, whereas electrons have positive mass. This
means that photons, unlike electrons, can easily be created and destroyed
even at low energies. Thus, the discrete aspect of light (that the
energy in light comes only in discrete “quanta,” namely the photons) is
less evident than the corresponding discrete aspect of electrons.


3.2 A Few Words About Operators 
and Their Adjoints 

In quantum mechanics, physical quantities—such as position, momentum, 
and energy—are represented by operators on a certain Hilbert space H. 
These operators are unbounded operators, reflecting the fact that in classical
mechanics, these quantities are unbounded functions on the classical phase
space. In this section, we look briefly at some technical issues related to 
unbounded operators and their adjoints. We will delay a full discussion of 
these technicalities (Chap. 9) until after we have understood the basic ideas 
of quantum mechanics. 

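Here and throughout the book, H will represent a Hilbert space over C,
always assumed to be separable. We follow the convention in the physics
literature that the inner product be linear in the second factor:

\[
\langle \phi, \lambda\psi \rangle = \lambda\, \langle \phi, \psi \rangle; \qquad \langle \lambda\phi, \psi \rangle = \bar{\lambda}\, \langle \phi, \psi \rangle
\]

for all $\phi, \psi \in \mathbf{H}$ and all $\lambda \in \mathbb{C}$.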

Recall (Appendix A.3.4) that a linear operator $A : \mathbf{H} \to \mathbf{H}$ is bounded
if there is a constant C such that $\|A\psi\| \leq C\|\psi\|$ for all $\psi \in \mathbf{H}$. For any
bounded operator A, there is a unique bounded operator $A^*$, called the
adjoint of A, such that

\[
\langle \phi, A\psi \rangle = \langle A^*\phi, \psi \rangle
\]

for all $\phi, \psi \in \mathbf{H}$. The existence of $A^*$ follows from the Riesz theorem
(Appendix A.4.3), by observing that for each fixed $\phi$, the map $\psi \mapsto \langle \phi, A\psi \rangle$
is a bounded linear functional on H. A bounded operator is said to be
self-adjoint if $A^* = A$.
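
In the finite-dimensional case $\mathbf{H} = \mathbb{C}^n$, every operator is bounded and the adjoint is just the conjugate transpose, which makes the defining relation easy to test. The sketch below, using plain Python complex numbers, checks the relation for one matrix and also the convention that the inner product is linear in the second factor and conjugate-linear in the first. The helper names and the sample matrix and vectors are our own.

```python
def inner(phi, psi):
    """Inner product on C^n, conjugate-linear in the FIRST factor."""
    return sum(a.conjugate()*b for a, b in zip(phi, psi))

def apply(A, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(A[i][j]*v[j] for j in range(len(v))) for i in range(len(A))]

def adjoint(A):
    """Conjugate transpose: entry (j, i) of the result is conj(A[i][j])."""
    n, m = len(A), len(A[0])
    return [[A[i][j].conjugate() for i in range(n)] for j in range(m)]

A = [[1 + 2j, 3j],
     [2 + 0j, -1j]]
phi = [1j, 2 + 0j]
psi = [1 - 1j, 0.5j]
lam = 2 - 3j

# Defining relation of the adjoint: <phi, A psi> = <A* phi, psi>.
print(inner(phi, apply(A, psi)), inner(apply(adjoint(A), phi), psi))
# Linearity in the second factor, conjugate-linearity in the first.
print(inner(phi, [lam*c for c in psi]), lam*inner(phi, psi))
print(inner([lam*c for c in phi], psi), lam.conjugate()*inner(phi, psi))
```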

For various reasons, both physical and mathematical, we want the
operators of quantum mechanics to be self-adjoint. Once one
sees the formulas for these operators, however, one is confronted with a
serious technical difficulty: the operators are not bounded.

If A is a linear operator defined on all of H and having the property
that $\langle \phi, A\psi \rangle = \langle A\phi, \psi \rangle$ for all $\phi, \psi \in \mathbf{H}$, then A is automatically bounded.
(See Corollary 9.9.) To put this fact the other way around, an unbounded
self-adjoint operator cannot be defined on the entire Hilbert space. Thus, to
deal with the unbounded operators of quantum mechanics, we must deal
with operators that are defined only on a subspace of the relevant Hilbert
space, called the domain of the operator.

Definition 3.1 An unbounded operator A on H is a linear map from
a dense subspace $\mathrm{Dom}(A) \subset \mathbf{H}$ into H.

More precisely, the operator A is “not necessarily bounded,” since nothing
in the definition prevents us from having Dom(A) = H and having A
be bounded.

In defining the adjoint of an unbounded operator, we immediately
encounter a difficulty: for a given $\phi \in \mathbf{H}$, the linear functional $\langle \phi, A\,\cdot\, \rangle$ may
not be bounded, in which case we cannot use the Riesz theorem to define
$A^*\phi$. What this means is that the adjoint of A, like A itself, will be defined
not on all of H but only on some subspace thereof.

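Definition 3.2 For an unbounded operator A on H, the adjoint $A^*$ of A
is defined as follows. A vector $\phi \in \mathbf{H}$ belongs to the domain $\mathrm{Dom}(A^*)$ of
$A^*$ if the linear functional

\[
\psi \mapsto \langle \phi, A\psi \rangle,
\]

defined on Dom(A), is bounded. For $\phi \in \mathrm{Dom}(A^*)$, let $A^*\phi$ be the unique
vector $\chi$ such that

\[
\langle \chi, \psi \rangle = \langle \phi, A\psi \rangle
\]

for all $\psi \in \mathrm{Dom}(A)$.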

Saying that the linear functional $\langle \phi, A\,\cdot\, \rangle$ is bounded means that there is
a constant C such that $|\langle \phi, A\psi \rangle| \leq C\|\psi\|$ for all $\psi \in \mathrm{Dom}(A)$. If $\langle \phi, A\,\cdot\, \rangle$ is
bounded, then since Dom(A) is dense, the BLT theorem (Theorem A.36)
tells us that $\langle \phi, A\,\cdot\, \rangle$ has a unique bounded extension to all of H. The Riesz
theorem then guarantees the existence and uniqueness of $\chi$. The adjoint of
an unbounded linear operator is a linear operator on its domain.

We are now ready to define self-adjointness (and some related notions) 
for unbounded operators. 

Definition 3.3 An unbounded operator A on H is symmetric if

\[
\langle \phi, A\psi \rangle = \langle A\phi, \psi \rangle
\]

for all $\phi, \psi \in \mathrm{Dom}(A)$. The operator A is self-adjoint if
$\mathrm{Dom}(A^*) = \mathrm{Dom}(A)$ and $A^*\phi = A\phi$ for all $\phi \in \mathrm{Dom}(A)$. Finally, A is essentially
self-adjoint if the closure in $\mathbf{H} \times \mathbf{H}$ of the graph of A is the graph of a
self-adjoint operator.

That is to say, A is self-adjoint if $A^*$ and A are the same operator with
the same domain. Every self-adjoint or essentially self-adjoint operator is
symmetric, but not every symmetric operator is essentially self-adjoint.
For any symmetric operator, $\mathrm{Dom}(A^*) \supset \mathrm{Dom}(A)$ and $A^*$ agrees with A
on Dom(A). The reason a symmetric operator may fail to be self-adjoint is
that $\mathrm{Dom}(A^*)$ may be strictly larger than Dom(A).

Although the condition of being symmetric is certainly easier to
understand (and to verify) than the condition of being self-adjoint,
self-adjointness is the “right” condition. In particular, the spectral theorem,
which is essential to much of quantum mechanics, applies only to operators
that are self-adjoint and not to operators that are merely symmetric. If A
is essentially self-adjoint, then we can obtain a self-adjoint operator from
A simply by taking the closure of the graph of A, and we can then apply
the spectral theorem to this self-adjoint operator. Thus, for many purposes,
it is enough to have our operators be essentially self-adjoint rather than
self-adjoint.

It is generally easy to verify that the operators of quantum mechanics 
(those representing position, momentum, and so forth) are symmetric on 
some suitably chosen domain. Proving that these operators are essentially 
self-adjoint, however, is substantially more difficult. Although establishing 
essential self-adjointness is a crucial technical issue, it is best not to worry 
too much about it on a first encounter with quantum mechanics. In this
chapter, we will not concern ourselves overly with technical details
concerning essential self-adjointness and the precise choice of domain for our
operators, depending on Chap. 9 to take care of such matters. For now, we
content ourselves with deriving some very elementary properties of
symmetric (and thus also self-adjoint) operators.

Proposition 3.4 Suppose A is a symmetric operator on H.

1. For all $\psi \in \mathrm{Dom}(A)$, the quantity $\langle \psi, A\psi \rangle$ is real. More generally, if
$\psi, A\psi, \ldots, A^{m-1}\psi$ all belong to Dom(A), then $\langle \psi, A^{m}\psi \rangle$ is real.

2. Suppose $\lambda$ is an eigenvalue for A, meaning that $A\psi = \lambda\psi$ for some
nonzero $\psi \in \mathrm{Dom}(A)$. Then $\lambda \in \mathbb{R}$.

Proof. Since A is symmetric, we have

\[
\langle \psi, A\psi \rangle = \langle A\psi, \psi \rangle = \overline{\langle \psi, A\psi \rangle}
\]

for all $\psi \in \mathrm{Dom}(A)$. If $\psi, A\psi, \ldots, A^{m-1}\psi$ all belong to the domain of A,
we can use the symmetry of A repeatedly to show that

\[
\langle \psi, A^{m}\psi \rangle = \langle A^{m}\psi, \psi \rangle = \overline{\langle \psi, A^{m}\psi \rangle}.
\]

Meanwhile, if $\psi$ is an eigenvector for A with eigenvalue $\lambda$, then

\[
\lambda\, \langle \psi, \psi \rangle = \langle \psi, A\psi \rangle = \langle A\psi, \psi \rangle = \bar{\lambda}\, \langle \psi, \psi \rangle.
\]

Since $\psi$ is assumed to be nonzero, this implies that $\lambda = \bar{\lambda}$. ■

Physically, as we will see later in this chapter, $\langle \psi, A\psi \rangle$ represents
the expectation value for measurements of A in the state $\psi$, whereas the
eigenvalue $\lambda$ represents one of the possible values for this measurement.
On physical grounds, we want both of these numbers to be real. If A is
self-adjoint, and not just symmetric, then the spectral theorem will give
a canonical way of associating to each $\psi \in \mathbf{H}$ a probability measure on
the real line that encodes the probabilities for measurements of A in the
state $\psi$.
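
Both conclusions of Proposition 3.4 can be seen concretely for a symmetric (that is, Hermitian) matrix on $\mathbb{C}^2$, and both can fail for a matrix that is not symmetric. The sketch below is a finite-dimensional illustration only, so none of the domain issues for unbounded operators arise; the matrices, the vector, and the helper names are our own choices.

```python
import cmath

def inner(phi, psi):
    """Inner product on C^n, conjugate-linear in the first factor."""
    return sum(a.conjugate()*b for a, b in zip(phi, psi))

def apply(A, v):
    return [sum(A[i][j]*v[j] for j in range(len(v))) for i in range(len(A))]

def eigenvalues_2x2(A):
    """Roots of the characteristic polynomial t^2 - tr(A) t + det(A)."""
    tr = A[0][0] + A[1][1]
    det = A[0][0]*A[1][1] - A[0][1]*A[1][0]
    disc = cmath.sqrt(tr*tr - 4*det)
    return (tr + disc)/2, (tr - disc)/2

A = [[2 + 0j, 1 - 1j],      # symmetric: equal to its conjugate transpose
     [1 + 1j, -1 + 0j]]
B = [[0j, -1 + 0j],         # not symmetric
     [1 + 0j, 0j]]
psi = [1 + 1j, 2 - 1j]

print(inner(psi, apply(A, psi)))   # imaginary part vanishes
print(eigenvalues_2x2(A))          # both eigenvalues real
print(eigenvalues_2x2(B))          # purely imaginary eigenvalues
```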


3.3 Position and the Position Operator 


Let us consider at first a single particle moving on the real line. The wave
function for such a particle is a map $\psi : \mathbb{R}^1 \to \mathbb{C}$. Although this map will
evolve in time, let us think for now that the time is fixed. The function
$|\psi(x)|^{2}$ is supposed to be the probability density for the position of the
particle. This means that the probability that the position of the particle
belongs to some set $E \subset \mathbb{R}^1$ is

\[
\int_E |\psi(x)|^{2}\, dx.
\]

For this prescription to make sense, $\psi$ should be normalized so that

\[
\int_{\mathbb{R}} |\psi(x)|^{2}\, dx = 1. \tag{3.1}
\]

That is, $\psi$ should be a unit vector in the Hilbert space $L^2(\mathbb{R})$.

Now, if the function $|\psi(x)|^{2}$ is the probability density for the position of
a particle, then according to the standard definitions of probability theory,
the expectation value of the position will be

\[
E(x) = \int_{\mathbb{R}} x\,|\psi(x)|^{2}\, dx, \tag{3.2}
\]

provided that the integral is absolutely convergent. More generally, we can
compute any moment of the position (i.e., the expectation value of some
power of the position) as

\[
E(x^{m}) = \int_{\mathbb{R}} x^{m}\,|\psi(x)|^{2}\, dx, \tag{3.3}
\]

assuming, again, the convergence of the integral.
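
To make the moment integrals concrete, the sketch below evaluates them numerically for a Gaussian wave function centered at $x_0$ with an oscillating phase $e^{i p_0 x}$. This particular $\psi$, the values $x_0 = 1$ and $p_0 = 2$, the integration window, and the function names are all assumptions made for illustration. For this choice $|\psi(x)|^2$ is a normal density with mean $x_0$ and variance 1/2, so we expect the zeroth, first, and second moments to be 1, $x_0$, and $x_0^2 + 1/2$; note that the phase does not affect any of them.

```python
import cmath
import math

def psi(x, x0=1.0, p0=2.0):
    """Normalized Gaussian wave packet (an assumed example, not from the text)."""
    return math.pi**-0.25 * cmath.exp(-(x - x0)**2/2 + 1j*p0*x)

def moment(wave, m, lo=-12.0, hi=14.0, n=20000):
    """Midpoint-rule approximation to the integral of x^m |wave(x)|^2 over [lo, hi]."""
    h = (hi - lo)/n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5)*h
        total += x**m * abs(wave(x))**2 * h
    return total

print(moment(psi, 0))   # normalization, as in (3.1)
print(moment(psi, 1))   # expectation value of the position, as in (3.2)
print(moment(psi, 2))   # second moment, as in (3.3)
```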

A key idea in quantum theory is to express expectation values of various
quantities (position, momentum, energy, etc.) in terms of operators and
the inner product on the relevant Hilbert space, in this case, $L^2(\mathbb{R})$. In the
case of position, we may introduce the position operator X defined by

\[
(X\psi)(x) = x\,\psi(x).
\]

That is, X is the “multiplication by x” operator. The point of introducing
this operator is that the expectation value of the position [defined in (3.2)]
may now be expressed as

\[
E(x) = \langle \psi, X\psi \rangle,
\]

where the inner product is the usual one on $L^2(\mathbb{R})$:

\[
\langle \phi, \psi \rangle = \int_{\mathbb{R}} \overline{\phi(x)}\,\psi(x)\, dx.
\]

(Recall that we are following the physics convention of putting the
conjugate on the first factor in the inner product.)

We use the following notation for the expectation value of the operator
$X$ in the state $\psi$:

$$\langle X\rangle_\psi := \langle\psi, X\psi\rangle.$$

The higher moments of the position, as defined in (3.3), are also computable
in terms of the position operator:

$$E(x^m) = \langle\psi, X^m\psi\rangle.$$


At this point, it is not clear that we have gained anything by writing 
our moments in terms of an operator and the inner product instead of in 
terms of the integral (3.3). The operator description will, however, motivate 
a parallel description of moments for the momentum, energy, or angular 
momentum of a particle in terms of corresponding operators. 

It should be noted that, for a given $\psi \in L^2(\mathbb{R})$, $X\psi$
might fail to be in $L^2(\mathbb{R})$. This failure of $X$ to be defined on
all of our Hilbert space reflects that $X$ is an unbounded operator,
something that we discussed briefly in Sect. 3.2. Even if $X\psi$ is in
$L^2(\mathbb{R})$, $X^m\psi$ might fail to be in $L^2(\mathbb{R})$ for some
$m$. Nevertheless, for any unit vector $\psi$ in $L^2(\mathbb{R})$, we have
a well-defined probability density on $\mathbb{R}$, given by $|\psi(x)|^2$.
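
As a numerical illustration of the moment formulas (3.2) and (3.3), the following sketch approximates the integrals by a midpoint rule. The Gaussian wave function, its center `x0`, its width `sigma`, and the integration window are all assumptions chosen for illustration; they do not come from the text.

```python
import math


def psi(x, x0=1.0, sigma=0.5):
    """Hypothetical Gaussian wave function; |psi|^2 is a normal density
    with mean x0 and variance sigma^2."""
    norm = (2.0 * math.pi * sigma ** 2) ** -0.25
    return norm * math.exp(-((x - x0) ** 2) / (4.0 * sigma ** 2))


def position_moment(m, a=-10.0, b=12.0, n=20000):
    """Approximate E(x^m) = integral of x^m |psi(x)|^2 dx by the midpoint rule."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        total += x ** m * abs(psi(x)) ** 2
    return total * dx


norm_sq = position_moment(0)          # normalization condition (3.1)
mean_x = position_moment(1)           # <psi, X psi>
variance_x = position_moment(2) - mean_x ** 2
print(norm_sq, mean_x, variance_x)
```

For this particular state the three printed numbers should be close to 1, `x0`, and `sigma**2`, since the density is a normal distribution.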

3.4 Momentum and the Momentum Operator 

At any fixed time, the wave function $\psi(x)$ of a particle (according to
the wave theory postulated by Schrödinger) is a function of a “position”
variable $x$ only. Although the wave function $\psi$ directly encodes the
probabilities for the position of the particle, through $|\psi(x)|^2$, it is
not as clear how information about the particle’s momentum is encoded. As it
turns out, the momentum is encoded in the oscillations of the wave function.
A crucial idea in quantum mechanics is the de Broglie hypothesis, which we
introduced in Sect. 1.2.2 as a way of understanding the allowed energies in
the Bohr model of the hydrogen atom. The de Broglie hypothesis proposes
a particular relationship between the frequency of oscillation of the wave
function (as a function of position at a fixed time) and its momentum.

Proposition 3.5 (de Broglie hypothesis) If the wave function of a
particle has spatial frequency $k$, then the momentum $p$ of the particle is

$$p = \hbar k, \tag{3.4}$$

where $\hbar$ is Planck’s constant.

The Davisson-Germer electron-diffraction experiments, described in Sect.
1.2.3, strongly support not only the idea that electrons have wavelike
behavior, but also the specific relationship (3.4) between the momentum
of an electron and the spatial frequency of the associated wave. Of course,
Proposition 3.5 is rather vague. To be a bit more precise, Proposition 3.5 is
supposed to mean that a wave function of the form $\psi(x) = e^{ikx}$
represents a particle with momentum $p = \hbar k$. [Here, as in Chap. 2,
“frequency” is in the angular sense. The cycles-per-unit-distance frequency
is $\nu = k/(2\pi)$.]

Now, the function $e^{ikx}$ is obviously not square integrable, so it is not
strictly possible for the wave function [which is supposed to satisfy (3.1)]
to be $e^{ikx}$. Let us therefore briefly switch to thinking of a particle on
a circle, so that we can avoid certain technicalities. We think of the wave
function $\psi$ for a particle on a circle as a $2\pi$-periodic function on
$\mathbb{R}$, satisfying the normalization condition

$$\int_0^{2\pi} |\psi(x)|^2 \, dx = 1.$$

For any integer $k$, it makes sense to say that the normalized wave function
$\psi(x) = e^{ikx}/\sqrt{2\pi}$ represents a particle with momentum
$p = \hbar k$. In this case, we are supposed to think that the momentum of
the particle is definite, that is, nonrandom. If the particle’s wave function
is $e^{ikx}/\sqrt{2\pi}$, then a measurement of the particle’s momentum
should (with probability 1) give the value $\hbar k$.

Now, the functions $e^{ikx}/\sqrt{2\pi}$, $k \in \mathbb{Z}$, form an
orthonormal basis for the Hilbert space of $2\pi$-periodic,
square-integrable functions, which may be identified with $L^2([0, 2\pi])$.
Thus, the typical wave function for a particle on a circle is

$$\psi(x) = \sum_{k=-\infty}^{\infty} a_k \frac{e^{ikx}}{\sqrt{2\pi}}, \tag{3.5}$$

where the sum is convergent in $L^2([0, 2\pi])$. If $\psi$ is normalized to
be a unit vector, then we have

$$\sum_{k=-\infty}^{\infty} |a_k|^2 = \|\psi\|^2_{L^2([0, 2\pi])} = 1. \tag{3.6}$$


For a particle with wave function given by (3.5), the momentum of the
particle is no longer definite. Rather, we are supposed to think that a
measurement of the particle’s momentum will yield one of the values
$\hbar k$, $k \in \mathbb{Z}$, with the probability of getting a particular
value $\hbar k$ being $|a_k|^2$. Following elementary probability theory,
then, the expectation value for the momentum should be

$$E(p) = \sum_{k=-\infty}^{\infty} \hbar k \, |a_k|^2 \tag{3.7}$$

and higher moments for the momentum should be

$$E(p^m) = \sum_{k=-\infty}^{\infty} (\hbar k)^m \, |a_k|^2, \tag{3.8}$$

assuming absolute convergence of the sum.

We would like to encode the moment conditions (3.7) and (3.8) in a
momentum operator $P$, which should be defined in such a way that if the
particle’s wave function $\psi$ is given by (3.5), then
$E(p^m) = \langle\psi, P^m\psi\rangle$. We can achieve this relation if $P$
satisfies

$$P e^{ikx} = \hbar k \, e^{ikx}, \tag{3.9}$$

since then,

$$\langle\psi, P^m\psi\rangle = \sum_{k=-\infty}^{\infty} (\hbar k)^m \, |a_k|^2 = E(p^m). \tag{3.10}$$

The (presumably unique) choice for $P$ satisfying (3.9) is

$$P = -i\hbar \frac{d}{dx}.$$
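
The agreement between (3.7) and the inner product $\langle\psi, P\psi\rangle$ can be checked numerically for a finite Fourier sum, with $P$ acting as $-i\hbar\, d/dx$ term by term. The sketch below is only an illustration: the coefficients in `raw`, the choice $\hbar = 1$, and the grid size are assumptions. For a trigonometric polynomial, the uniform Riemann sum over one period is exact up to rounding.

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1 (an assumption for illustration)

# Hypothetical Fourier coefficients a_k, rescaled so that sum |a_k|^2 = 1.
raw = {-1: 1.0 + 0j, 0: 2.0j, 2: -2.0 + 0j}
scale = math.sqrt(sum(abs(a) ** 2 for a in raw.values()))
coeffs = {k: a / scale for k, a in raw.items()}


def psi(x):
    """Wave function on the circle, as in (3.5), with finitely many terms."""
    total = sum(a * cmath.exp(1j * k * x) for k, a in coeffs.items())
    return total / math.sqrt(2 * math.pi)


def p_psi(x):
    """P psi with P = -i hbar d/dx, differentiating each exponential."""
    total = sum(-1j * HBAR * (1j * k) * a * cmath.exp(1j * k * x)
                for k, a in coeffs.items())
    return total / math.sqrt(2 * math.pi)


def inner(f, g, n=64):
    """Inner product on [0, 2 pi], conjugate on the first factor."""
    dx = 2 * math.pi / n
    return sum(f(j * dx).conjugate() * g(j * dx) for j in range(n)) * dx


operator_side = inner(psi, p_psi)                                   # <psi, P psi>
probability_side = sum(HBAR * k * abs(a) ** 2 for k, a in coeffs.items())  # (3.7)
print(operator_side, probability_side)
```

The two printed values should agree, with a negligible imaginary part on the operator side.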



Returning now to the setting of the real line, it is natural to postulate
that the momentum operator $P$ on the line should also be given by
$P = -i\hbar \, d/dx$. This operator satisfies the relation

$$P e^{ikx} = (\hbar k) e^{ikx},$$

which is supposed to capture the idea that the wave function $e^{ikx}$ has
momentum $\hbar k$. Although the function $e^{ikx}$ is not square-integrable
with respect to $x$, the Fourier transform allows us to build up any
square-integrable function as a “superposition” of functions of the form
$e^{ikx}$. (Superposition is the term physicists use for a linear combination
or the continuous analog thereof, namely an integral.) This means that [by
analogy to (3.5)] we have

$$\psi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{ikx} \hat{\psi}(k) \, dk, \tag{3.11}$$

where $\hat{\psi}(k)$ is the Fourier transform of $\psi$, defined by

$$\hat{\psi}(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ikx} \psi(x) \, dx. \tag{3.12}$$

(See Appendix A.3.2 for information about the Fourier transform.)

The Plancherel theorem (Theorem A.19) then tells us that the Fourier
transform is a unitary map of $L^2(\mathbb{R})$ onto $L^2(\mathbb{R})$.
Thus, for any unit vector $\psi \in L^2(\mathbb{R})$,

$$\int_{-\infty}^{\infty} |\hat{\psi}(k)|^2 \, dk = 1.$$

In light of what we have in the circle case, it is natural to think that
$|\hat{\psi}(k)|^2$ is essentially the probability density for the momentum
of the particle. (To be precise, $|\hat{\psi}(k)|^2$ is the probability
density for $p/\hbar$.)

We can now express the properties of the momentum operator entirely
within the Hilbert space $L^2(\mathbb{R})$, without making explicit mention
of the non-square-integrable functions $e^{ikx}$.


Proposition 3.6 Define the momentum operator $P$ by

$$P = -i\hbar \frac{d}{dx}.$$

Then for all sufficiently nice unit vectors $\psi$ in $L^2(\mathbb{R})$, we
have

$$\langle\psi, P^m\psi\rangle = \int_{-\infty}^{\infty} (\hbar k)^m \, |\hat{\psi}(k)|^2 \, dk \tag{3.13}$$

for all positive integers $m$. The quantity in (3.13) is interpreted as the
expectation value of the $m$th power of the momentum, $E(p^m)$.

Equation (3.13) should be compared to (3.10) in the case of the circle.

Proof. If $\psi$ is in, say, the Schwartz space (Definition A.15), then, by
applying Proposition A.17 $m$ times, we see that the Fourier transform of the
$m$th derivative of $\psi$ is $(ik)^m \hat{\psi}(k)$, and so the Fourier
transform of $P^m\psi$ is $(\hbar k)^m \hat{\psi}(k)$. Meanwhile, since the
Fourier transform is unitary, we have

$$\langle\psi, P^m\psi\rangle = \int_{-\infty}^{\infty} \overline{\hat{\psi}(k)} \, (\hbar k)^m \, \hat{\psi}(k) \, dk,$$

which gives (3.13). (The assumption that $\psi$ be in the Schwartz space is
stronger than necessary. The reader is invited to use integration by parts
and the definition of the Fourier transform to find weaker assumptions that
allow the same conclusion.) ■
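
The identity in Proposition 3.6 can be illustrated numerically by computing the momentum moments on the Fourier side and comparing the first moment with a direct evaluation of $\langle\psi, P\psi\rangle$ on the position side. The following is a rough sketch under stated assumptions: the Gaussian wave packet, the wave number `K0`, the choice $\hbar = 1$, and the grids are all hypothetical, and the Fourier transform is approximated by a plain midpoint sum rather than by a library routine.

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1 (an assumption)
K0 = 1.5    # hypothetical mean wave number of the packet


def psi(x):
    """Normalized Gaussian wave packet exp(-x^2/2 + i K0 x) / pi^(1/4)."""
    return math.pi ** -0.25 * cmath.exp(-x * x / 2 + 1j * K0 * x)


# Midpoint grids in x and k; the packet is negligible outside both windows.
NX, X_MAX = 800, 8.0
DX = 2 * X_MAX / NX
XS = [-X_MAX + (i + 0.5) * DX for i in range(NX)]
PSI_VALUES = [psi(x) for x in XS]

NK, K_HALF = 200, 6.0
DK = 2 * K_HALF / NK
KS = [K0 - K_HALF + (j + 0.5) * DK for j in range(NK)]


def psi_hat(k):
    """Fourier transform of psi at k, approximated by a midpoint sum."""
    total = sum(cmath.exp(-1j * k * x) * v for x, v in zip(XS, PSI_VALUES))
    return total * DX / math.sqrt(2 * math.pi)


PSI_HAT_VALUES = [psi_hat(k) for k in KS]


def moment_fourier_side(m):
    """Integral of (hbar k)^m |psi_hat(k)|^2 dk, the right side of (3.13)."""
    return sum((HBAR * k) ** m * abs(v) ** 2
               for k, v in zip(KS, PSI_HAT_VALUES)) * DK


def mean_position_side(h=1e-4):
    """<psi, P psi> with P = -i hbar d/dx, using a central difference."""
    total = 0j
    for x, v in zip(XS, PSI_VALUES):
        dpsi = (psi(x + h) - psi(x - h)) / (2 * h)
        total += v.conjugate() * (-1j * HBAR) * dpsi
    return total * DX


print(moment_fourier_side(1), mean_position_side())
```

For this packet, $|\hat{\psi}(k)|^2$ is a Gaussian centered at `K0` with variance 1/2, so the first two moments should be close to `HBAR * K0` and `HBAR**2 * (K0**2 + 0.5)`.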


3.5 The Position and Momentum Operators 

In the following definition, we summarize what we have learned, in the two
previous sections, about the position and momentum operators.

Definition 3.7 For a particle moving in $\mathbb{R}^1$, let the quantum
Hilbert space be $L^2(\mathbb{R})$ and define the position and momentum
operators $X$ and $P$ by

$$X\psi(x) = x\psi(x)$$

$$P\psi(x) = -i\hbar \frac{d\psi}{dx}.$$

Neither the position nor the momentum operator is defined as mapping
the entire Hilbert space $L^2(\mathbb{R})$ into itself. After all, for
$\psi \in L^2(\mathbb{R})$, the function $x\psi(x)$ may fail to be in
$L^2(\mathbb{R})$. Similarly, a function $\psi$ in $L^2(\mathbb{R})$ may
fail to be differentiable, and even if it is differentiable, the derivative
may fail to be in $L^2(\mathbb{R})$. What this means is that $X$ and $P$ are
unbounded operators, of the sort discussed briefly in Sect. 3.2. They are
defined on suitable dense subspaces $\mathrm{Dom}(X)$ and $\mathrm{Dom}(P)$
of $L^2(\mathbb{R})$. We defer a detailed examination of the domains of
these operators until Chap. 9.

A vitally important property of this pair of operators is that they do not 
commute. 


Proposition 3.8 The position and momentum operators $X$ and $P$ do not
commute, but satisfy the relation

$$XP - PX = i\hbar I. \tag{3.14}$$

This relation is known as the canonical commutation relation.

Proof. Using the product rule we calculate that

$$\begin{aligned}
PX\psi &= -i\hbar \frac{d}{dx}\bigl(x\psi(x)\bigr) \\
&= -i\hbar\,\psi(x) - i\hbar\, x \frac{d\psi}{dx} \\
&= -i\hbar\,\psi(x) + XP\psi,
\end{aligned}$$

from which (3.14) follows. ■
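
The product-rule calculation behind the canonical commutation relation can be mirrored in a few lines of exact arithmetic. The sketch below applies $X$ and $P$ to polynomials stored as coefficient lists. Polynomials are not elements of $L^2(\mathbb{R})$, so this is only an algebraic check of the formula and says nothing about domains; the test polynomial and the choice $\hbar = 1$ are assumptions.

```python
HBAR = 1.0  # units with hbar = 1 (an assumption)


def apply_x(p):
    """Multiply by x: the coefficient of x^n moves to x^(n+1)."""
    return [0j] + list(p)


def apply_p(p):
    """Apply P = -i hbar d/dx to the coefficient list p (p[n] multiplies x^n)."""
    derivative = [n * c for n, c in enumerate(p)][1:]
    return [-1j * HBAR * c for c in derivative] or [0j]


def subtract(p, q):
    """Coefficientwise difference p - q, padding the shorter list with zeros."""
    size = max(len(p), len(q))
    p = list(p) + [0j] * (size - len(p))
    q = list(q) + [0j] * (size - len(q))
    return [a - b for a, b in zip(p, q)]


def commutator_xp(p):
    """Compute (XP - PX) applied to the polynomial p."""
    return subtract(apply_x(apply_p(p)), apply_p(apply_x(p)))


poly = [2 + 1j, -3, 0, 5j]            # hypothetical test polynomial
result = commutator_xp(poly)
expected = [1j * HBAR * c for c in poly]  # i hbar times the input
print(result)
print(expected)
```

The two printed lists should coincide, reflecting $(XP - PX)\psi = i\hbar\psi$.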

There are many important consequences of the relation (3.14), which we
will examine at length in Chaps. 11-14 of the book. For now, we simply note
a parallel between (3.14) and the Poisson bracket relationship in classical
mechanics: $\{x, p\} = 1$, as follows directly from the definition of the
Poisson bracket. This hints at an analogy, which we will explore further in
Sect. 3.7, between the commutator of two operators $A$ and $B$ on the
quantum side (namely, the operator $AB - BA$) and the Poisson bracket of two
functions $f$ and $g$ on the classical side.

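Proposition 3.9 For all sufficiently nice functions $\phi$ and $\psi$ in
$L^2(\mathbb{R})$, we have

$$\langle\phi, X\psi\rangle = \langle X\phi, \psi\rangle$$

and

$$\langle\phi, P\psi\rangle = \langle P\phi, \psi\rangle.$$

Proof. Suppose that $\phi$ and $\psi$ belong to $L^2(\mathbb{R})$ and that
the functions $x\phi(x)$ and $x\psi(x)$ also belong to $L^2(\mathbb{R})$.
Then since $x$ is real, we have

$$\int_{-\infty}^{\infty} \overline{\phi(x)} \, x\psi(x) \, dx = \int_{-\infty}^{\infty} \overline{x\phi(x)} \, \psi(x) \, dx,$$

where both integrals are convergent because they are both integrals of the
product of two $L^2$ functions.

Meanwhile, for the second claim, let us assume that $\phi$ and $\psi$ are
continuously differentiable and that $\phi(x)$ and $\psi(x)$ tend to zero as
$x$ tends to $\pm\infty$. Let us also assume that $\phi$, $\psi$,
$d\phi/dx$, and $d\psi/dx$ belong to $L^2(\mathbb{R})$. We note that
$d\bar{\phi}/dx$ is the same as $\overline{d\phi/dx}$. Thus, using
integration by parts, we obtain

$$-i\hbar \int_{-A}^{A} \overline{\phi(x)} \, \frac{d\psi}{dx} \, dx = -i\hbar \, \overline{\phi(x)} \, \psi(x) \Big|_{-A}^{A} + i\hbar \int_{-A}^{A} \overline{\frac{d\phi}{dx}} \, \psi(x) \, dx.$$

Under our assumptions on $\phi$ and $\psi$, as $A$ tends to infinity, the
boundary terms will vanish and the remaining integrals will tend (by
dominated convergence) to integrals over the whole real line. Thus,

$$\begin{aligned}
-i\hbar \int_{-\infty}^{\infty} \overline{\phi(x)} \, \frac{d\psi}{dx} \, dx &= i\hbar \int_{-\infty}^{\infty} \overline{\frac{d\phi}{dx}} \, \psi(x) \, dx \\
&= \int_{-\infty}^{\infty} \overline{\left(-i\hbar \frac{d\phi}{dx}\right)} \, \psi(x) \, dx,
\end{aligned}$$

which is the second claim in the proposition. ■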

In the language of Definition 3.3, Proposition 3.9 means that $X$ and $P$
are symmetric operators on certain dense subspaces of $L^2(\mathbb{R})$ (the
space of functions for which the proposition is proved). It is actually true
that $X$ and $P$ are essentially self-adjoint on these domains. The proof of
essential self-adjointness, however, will have to wait until Chap. 9.


3.6 Axioms of Quantum Mechanics: Operators 
and Measurements 

In this section we consider the general “axioms” of quantum mechanics. 
These axioms are not to be understood in the mathematical sense as rules 
from which all other results are derived in a strictly deductive fashion. 
Rather, the axioms are the main principles of how quantum mechanics 
works. Here we look at the “kinematic” axioms, those that apply at one 
fixed time. There is one additional axiom, governing the time-evolution of
the system, which we consider in the next section.

Axiom 1 The state of the system is represented by a unit vector $\psi$ in an
appropriate Hilbert space $\mathbf{H}$. If $\psi_1$ and $\psi_2$ are two unit
vectors in $\mathbf{H}$ with $\psi_2 = c\psi_1$ for some constant
$c \in \mathbb{C}$, then $\psi_1$ and $\psi_2$ represent the same
physical state.

The Hilbert space H is frequently called the “quantum Hilbert space.” 
This does not, however, mean that H is some variant of the notion of a 
Hilbert space, the way a quantum group is a variant of the notion of a 
group. Rather, “quantum Hilbert space” means simply, “the Hilbert space 
associated with a given quantum system.” 

In Axiom 1, it should be noted that unit vectors in H actually represent 
only the “pure states” of the theory. There is a more general notion of a 
“mixed state” (described by a “density matrix”) that we will consider in 
Chap. 19. We will follow the custom in most physics texts of considering at 
first only pure states. 

Axiom 2 To each real-valued function $f$ on the classical phase space there
is associated a self-adjoint operator $\hat{f}$ on the quantum Hilbert space.

In almost all cases, the operator $\hat{f}$ is unbounded. This unboundedness
is unsurprising when we realize that physically relevant functions $f$ on
the classical phase space (e.g., position and momentum) are unbounded
functions. In the unbounded case, the notion of self-adjointness is rather
technical; see Definition 3.3 in Sect. 3.2. In most applications, it is not
really necessary to define $\hat{f}$ for all functions on the classical phase
space, but only for certain basic functions, such as position, momentum,
energy, and angular momentum. We will describe the quantizations of these
basic functions in this chapter. If one really needs to define $\hat{f}$ for
an arbitrary function $f$ (satisfying some regularity assumptions), the
standard approach is to use the Weyl quantization scheme, described in
Chap. 13.

For a particle moving in $\mathbb{R}^1$, the classical phase space is
$\mathbb{R}^2$, which we think of as pairs $(x, p)$ with $x$ being the
particle’s position and $p$ being its momentum. The quantum Hilbert space in
this case is usually taken to be $L^2(\mathbb{R})$ [not $L^2(\mathbb{R}^2)$].
In that case, if the function $f$ in Axiom 2 is the position function,
$f(x, p) = x$, then the associated operator $\hat{f}$ is the position
operator $X$, given by multiplication by $x$. If $f$ is the momentum
function, $f(x, p) = p$, then $\hat{f}$ is the momentum operator
$P = -i\hbar \, d/dx$.

In the physics literature, a function $f$ on the classical phase space is
called a classical observable, meaning that it is some physical quantity that
could be observed by taking a measurement of the system. The corresponding
operator $\hat{f}$ is then called a quantum observable.

Axiom 3 If a quantum system is in a state described by a unit vector
$\psi \in \mathbf{H}$, the probability distribution for the measurement of
some observable $f$ satisfies

$$E(f^m) = \langle\psi, (\hat{f})^m \psi\rangle. \tag{3.15}$$

In particular, the expectation value for a measurement of $f$ is given by

$$\langle\psi, \hat{f}\psi\rangle. \tag{3.16}$$

Note that we have adopted the point of view that even in a quantum
mechanical system, what one is measuring is the classical observable $f$.
In the quantum case, however, $f$ no longer has a definite value, but only
probabilities, which are encoded by the quantum observable $\hat{f}$ and the
vector $\psi \in \mathbf{H}$.

If $\psi$ is a nonzero vector in $\mathbf{H}$ but not a unit vector, then
(3.16) should be replaced by

$$\langle\tilde{\psi}, \hat{f}\tilde{\psi}\rangle = \frac{\langle\psi, \hat{f}\psi\rangle}{\langle\psi, \psi\rangle},$$

where $\tilde{\psi} := \psi/\|\psi\|$ is the unit vector associated with
$\psi$. It is convenient to assume that our vectors have been normalized to
be unit vectors, simply to avoid having to divide by
$\langle\psi, \psi\rangle$ in our expectation values.

Since $\hat{f}$ is assumed to be self-adjoint and every self-adjoint
operator is symmetric, Proposition 3.4 tells us that the moments $E(f^m)$,
and in particular the expectation value $E(f)$, are real numbers. Since
$\hat{f}$ is assumed to be self-adjoint and not just symmetric, the spectral
theorem (Chaps. 7 and 10) will give a canonical way of constructing a
probability measure $\mu_{A,\psi}$ on $\mathbb{R}$ that may be interpreted as
the probability distribution for measurements of $A$ in the state $\psi$.

Axiom 3 provides motivation for the idea that two unit vectors that differ
by a constant represent the same physical state. If $\psi_2 = c\psi_1$ with
$|c| = 1$, then for any operator $A$, we have

$$\langle\psi_2, A\psi_2\rangle = \langle c\psi_1, A c\psi_1\rangle = |c|^2 \langle\psi_1, A\psi_1\rangle = \langle\psi_1, A\psi_1\rangle.$$

Thus, the expectation values of all observables are the same in the state
$\psi_2$ as in the state $\psi_1$.


Notation 3.10 If $A$ is a self-adjoint operator on $\mathbf{H}$ and
$\psi \in \mathbf{H}$ is a unit vector, the expectation value of $A$ in the
state $\psi$ is denoted $\langle A\rangle_\psi$ and is defined (in light of
Axiom 3) to be

$$\langle A\rangle_\psi = \langle\psi, A\psi\rangle. \tag{3.17}$$

Proposition 3.11 (Eigenvectors) If a quantum system is in a state
described by a unit vector $\psi \in \mathbf{H}$ and for some quantum
observable $\hat{f}$ we have $\hat{f}\psi = \lambda\psi$ for some
$\lambda \in \mathbb{R}$, then

$$E(f^m) = \langle(\hat{f})^m\rangle_\psi = \lambda^m \tag{3.18}$$

for all positive integers $m$. The unique probability measure consistent with
this condition is the one in which $f$ has the definite value $\lambda$, with
probability one.

What the proposition means is that if $\psi$ is an eigenvector for
$\hat{f}$, then measurements of $f$ for a particle in the state $\psi$ are
not actually random, but rather always give the answer $\lambda$. If
$\hat{f}\psi = \lambda\psi$, then
$\langle\psi, (\hat{f})^m\psi\rangle = \lambda^m\langle\psi, \psi\rangle = \lambda^m$.
Thus, by (3.15), we want to find a probability measure $\mu$ on $\mathbb{R}$
such that

$$\int_{\mathbb{R}} x^m \, d\mu(x) = \lambda^m \tag{3.19}$$

for all non-negative integers $m$. The proposition is claiming that there is
one and only one such measure, namely the $\delta$-measure at the point
$\lambda$.

Because $\hat{f}$ is assumed to be self-adjoint and therefore symmetric,
Proposition 3.4 tells us that every eigenvalue of $\hat{f}$ is real.

Proof. The relation (3.18) follows from (3.15) and the fact that
$\hat{f}\psi = \lambda\psi$. If $\mu$ is the $\delta$-measure at $\lambda$,
then certainly (3.19) holds. Conversely, since the $m$th moment grows only
exponentially with $m$, even the most elementary uniqueness results for the
moment problem show that the $\delta$-measure is the only measure with these
moments. (See, e.g., Theorem 8.1 in Chap. 4 of [18].) ■

If, more generally, the state of the system is a linear combination of
eigenvectors for $\hat{f}$, measurements of $f$ will no longer be
deterministic.

Example 3.12 Suppose $\hat{f}$ has an orthonormal basis $\{e_j\}$ of
eigenvectors with distinct (real) eigenvalues $\lambda_j$. Suppose also that
$\psi$ is a unit vector in $\mathbf{H}$ with the expansion

$$\psi = \sum_{j=1}^{\infty} a_j e_j. \tag{3.20}$$

Then for a measurement in the state $\psi$ of the observable $f$, the
observed value of $f$ will always be one of the numbers $\lambda_j$.
Furthermore, the probability of observing the value $\lambda_j$ is given by

$$\mathrm{Prob}\{f = \lambda_j\} = |a_j|^2. \tag{3.21}$$

Assuming that $\psi$ is in the domain of $(\hat{f})^m$, it is easy to verify
that the probabilities in (3.21) are consistent with the expectation values
given in Axiom 3. After all, if $\psi$ is given as in (3.20), then we can
readily calculate that $\langle\psi, (\hat{f})^m\psi\rangle$ equals
$\sum_j |a_j|^2 \lambda_j^m$, which is nothing but the $m$th moment
associated with the probability distribution in (3.21). In general, we
cannot quite derive (3.21) from Axiom 3, since the uniqueness results for the
moment problem might not apply. Nevertheless, (3.21) is the most natural
candidate for the probabilities, and we will assume that this formula holds.
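
The consistency check described in Example 3.12 is easy to carry out for a finite expansion. In the sketch below, the eigenvalues and coefficients are hypothetical, and the operator is represented only through its action in its own eigenbasis, where it is diagonal.

```python
import math

EIGENVALUES = [-1.0, 0.5, 2.0]          # hypothetical distinct eigenvalues
RAW_COEFFS = [1 + 1j, 2 + 0j, -1j]      # hypothetical expansion coefficients
NORM = math.sqrt(sum(abs(a) ** 2 for a in RAW_COEFFS))
COEFFS = [a / NORM for a in RAW_COEFFS]  # rescaled so that psi is a unit vector

# Probability of observing each eigenvalue, as in (3.21).
PROBABILITIES = [abs(a) ** 2 for a in COEFFS]


def probabilistic_moment(m):
    """m-th moment of the distribution that assigns |a_j|^2 to lambda_j."""
    return sum(p * lam ** m for p, lam in zip(PROBABILITIES, EIGENVALUES))


def operator_moment(m):
    """<psi, f^m psi>, computed in the eigenbasis where f^m is diagonal."""
    f_m_psi = [lam ** m * a for lam, a in zip(EIGENVALUES, COEFFS)]
    return sum(a.conjugate() * b for a, b in zip(COEFFS, f_m_psi))


for m in range(4):
    print(m, probabilistic_moment(m), operator_moment(m))
```

Each row should show the same number twice (the operator side carries a zero imaginary part), which is the agreement between (3.21) and Axiom 3 in this finite setting.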

It is not difficult to extend Example 3.12 to the case where the eigenvalues
are not distinct: For any sequence $\{\lambda_j\}$ of eigenvalues, the
probability of observing some value $\lambda$ will be the sum of $|a_j|^2$
over all those values of $j$ for which $\lambda_j = \lambda$. For any
self-adjoint operator $A$, the spectral theorem implies that $A$ has either
an orthonormal basis of eigenvectors or some continuous analog thereof. In
particular, given a self-adjoint operator $A$ and a unit vector
$\psi \in \mathbf{H}$, the spectral theorem will give us a probability
measure $\mu^A_\psi$ on $\mathbb{R}$ that we interpret as describing the
probabilities for a measurement of $A$ in the state $\psi$. See Proposition
7.17 in the bounded case and Definition 10.7 in the unbounded case.

Axiom 4 Suppose a quantum system is initially in a state $\psi$ and that a
measurement of an observable $f$ is performed. If the result of the
measurement is the number $\lambda \in \mathbb{R}$, then immediately after
the measurement, the system will be in a state $\psi'$ that satisfies

$$\hat{f}\psi' = \lambda\psi'.$$

The passage from $\psi$ to $\psi'$ is called the collapse of the wave
function. Here $\hat{f}$ is the self-adjoint operator associated with $f$ by
Axiom 2.

Let us assume again that $\hat{f}$ has an orthonormal basis of eigenvectors
$\{e_j\}$ with distinct eigenvalues $\lambda_j$. Then we can say, more
specifically, that if we observe the value $\lambda_j$ in a measurement of
$f$ (and we will always observe one of the $\lambda_j$’s) then
$\psi' = e_j$. That is, the measurement “collapses” the wave function by
throwing away all the components of $\psi$ in the direction of the $e_k$’s,
except the one with $k = j$.

This idea of the collapse of the wave function has generated an enormous
amount of discussion and controversy. One way to look at the situation is
to think that the wave function $\psi$ is not actually the state of the
system, although we continue to use the standard physics term, “state.”
Rather, the wave function is the thing that encodes the probabilities for the
state of the system. The collapse of the wave function is then something
similar to a conditional probability; the probabilities for future
measurements of the system should be consistent with the outcome of the
measurement we just made. Paul Dirac has described the collapse of the wave
function as being not a discontinuous change in the state of the system, but
a discontinuous change in our knowledge of the state of the system.

In any case, Axiom 4 guarantees the following reasonable principle: If
we measure $f$ and then measure $f$ again a very short time later, the result
of the second measurement will agree with the result of the first
measurement. Thus, immediately after the first measurement, the probabilities
for a second measurement of $f$ are not those associated with the vector
$\psi$, but rather those associated with the state $\psi'$. (Since $\psi'$ is
an eigenvector for $\hat{f}$ with eigenvalue $\lambda$, Proposition 3.11
tells us that measurements of $f$ in the state $\psi'$ always give the value
$\lambda$.)
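
The repeatability principle can be illustrated with a small simulation of the measurement rule in the case of an orthonormal eigenbasis with distinct eigenvalues. This is a toy model under stated assumptions, not a description of any physical apparatus: the eigenvalues, coefficients, and random seed are hypothetical, states are stored as coefficient lists in the eigenbasis, and the collapsed state is taken to be the basis vector itself (a unit multiple of it would represent the same physical state).

```python
import math
import random

EIGENVALUES = [-1.0, 0.5, 2.0]           # hypothetical distinct eigenvalues
RAW = [1 + 1j, 2 + 0j, -1j]              # hypothetical coefficients a_j
NORM = math.sqrt(sum(abs(a) ** 2 for a in RAW))
PSI = [a / NORM for a in RAW]            # unit vector in the eigenbasis


def measure(state, rng):
    """Simulate one measurement: pick index j with probability |a_j|^2 and
    return the observed eigenvalue together with the collapsed state e_j."""
    weights = [abs(a) ** 2 for a in state]
    j = rng.choices(range(len(state)), weights=weights)[0]
    collapsed = [1 + 0j if i == j else 0j for i in range(len(state))]
    return EIGENVALUES[j], collapsed


rng = random.Random(0)
first_value, psi_prime = measure(PSI, rng)
second_value, _ = measure(psi_prime, rng)   # immediate repeat measurement
print(first_value, second_value)
```

Whatever value the first call returns, the second call returns the same value, because the collapsed state assigns probability one to that eigenvalue.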

Note that Axiom 4 only tells us something about the state of the system
immediately after a measurement. Following the measurement, the state of
the system will evolve in time in the usual way (Sect. 3.7). A significant
time after the measurement, then, the system will probably no longer be
in the state $\psi'$.

Let us conclude this section by considering an example of how one makes
a measurement of a real-world physical system, namely, the hydrogen atom.
The Hamiltonian operator $\hat{H}$ for a hydrogen atom has negative
eigenvalues of the form

$$-\frac{R}{n^2}, \tag{3.22}$$

where $R$ is the Rydberg constant and $n = 1, 2, 3, \ldots$ These energies
will be derived in Chap. 18. Negative eigenvalues are of greater interest
than positive ones, because negative eigenvalues describe states where the
electron is bound to the nucleus. If an electron is placed into a state
having energy $-R/n_1^2$, with $n_1 > 1$, it will eventually “decay” into a
state with lower energy, say, $-R/n_2^2$, with $n_2 < n_1$. (The most readily
observed cases are those with $n_2 = 2$ and $n_2 = 1$.) In the process of
decaying, the electron emits a photon, with the energy of the photon being
equal to the change in energy of the electron, namely,

$$E_{\text{photon}} = \frac{R}{n_2^2} - \frac{R}{n_1^2}. \tag{3.23}$$

Meanwhile, the frequency of the photon is proportional to its energy. Thus,
by observing the frequency of the emitted photon, one can determine the
change in energy of the electron and thus determine the values of $n_1$ and
$n_2$.

A general “bound state” of the hydrogen atom (a state in which the
electron is bound to the nucleus) will be a linear combination of
eigenvectors for $\hat{H}$ with various different eigenvalues of the form
(3.22). To measure the energy of the electron, we simply wait for the
electron to decay into a lower-energy state and emit a photon, observe the
frequency of the photon, and work backwards to the energy of the electron.
If we consider many “identically prepared” electrons, all having the same
wave function that is a linear combination of eigenvectors, we will observe
many different frequencies for the emitted photons, and thus many different
energies for the electron. The probabilities for the observed energies of
the electron will follow the principle spelled out in Example 3.12.
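
The step of “working backwards” from the photon to the energy levels can be sketched as follows. The numerical value used for $R$ (in electron volts) is an approximate, assumed input, the search bound `max_n` is arbitrary, and the function names are hypothetical.

```python
RYDBERG_EV = 13.605693  # approximate value of R in electron volts (assumed input)


def level_energy(n):
    """Bound-state energy -R / n^2, as in (3.22)."""
    return -RYDBERG_EV / n ** 2


def photon_energy(n1, n2):
    """Energy released in a decay from level n1 down to level n2, as in (3.23)."""
    if not 1 <= n2 < n1:
        raise ValueError("need 1 <= n2 < n1")
    return level_energy(n1) - level_energy(n2)


def identify_transition(observed_energy, max_n=10):
    """Return the pair (n1, n2) with n1 <= max_n whose photon energy is
    closest to the observed value."""
    pairs = [(n1, n2) for n1 in range(2, max_n + 1) for n2 in range(1, n1)]
    return min(pairs, key=lambda p: abs(photon_energy(*p) - observed_energy))


print(photon_energy(3, 2))
print(identify_transition(photon_energy(4, 2)))
```

In practice one observes a frequency rather than an energy; converting between the two uses the proportionality mentioned above and is omitted here.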

In basic probability theory, if $Y$ is a random variable then the variance
$\sigma^2$ of $Y$ is computed as

$$\sigma^2 = E\left[(Y - E(Y))^2\right],$$

where $E$ denotes the mean or expectation value of a random variable. The
standard deviation $\sigma := \sqrt{\sigma^2}$ is a measure of the “typical”
deviation from the mean $E(Y)$. Observe that the variance may be computed as

$$\begin{aligned}
\sigma^2 &= E\left[Y^2 - 2E(Y)Y + E(Y)^2\right] \\
&= E(Y^2) - 2E(Y)^2 + E(Y)^2 \\
&= E(Y^2) - E(Y)^2.
\end{aligned} \tag{3.24}$$

Definition 3.13 If $A$ is a self-adjoint operator on a Hilbert space
$\mathbf{H}$ and $\psi$ is a unit vector in $\mathbf{H}$, let
$\Delta_\psi A$ denote the standard deviation associated with measurements
of $A$ in the state $\psi$, which is computed as

$$(\Delta_\psi A)^2 = \langle A^2\rangle_\psi - \left(\langle A\rangle_\psi\right)^2.$$

We refer to $\Delta_\psi A$ as the uncertainty of $A$ in the state $\psi$.

For any single observable $A$, it is possible to choose $\psi$ so that
$\Delta_\psi A$ is as small as we like. In Chap. 12, however, we will see
that when two observables $A$ and $B$ do not commute, then $\Delta_\psi A$
and $\Delta_\psi B$ cannot both be made arbitrarily small for the same
$\psi$. In particular, we will derive there the famous Heisenberg uncertainty
principle, which states that

$$(\Delta_\psi X)(\Delta_\psi P) \geq \frac{\hbar}{2}$$

for all $\psi$ for which $\Delta_\psi X$ and $\Delta_\psi P$ are defined.
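
As a concrete instance of Definition 3.13, the uncertainties of $X$ and $P$ can be computed numerically for a Gaussian wave function, a standard example in which the product of the two uncertainties equals $\hbar/2$. The sketch below makes several assumptions: the width `SIGMA`, the choice $\hbar = 1$, and the integration window are hypothetical, the derivative is entered by hand for this particular function, and $\langle\psi, P^2\psi\rangle$ is evaluated as $\|P\psi\|^2$, which relies on the symmetry of $P$.

```python
import math

HBAR = 1.0   # units with hbar = 1 (an assumption)
SIGMA = 0.7  # hypothetical width parameter


def psi(x):
    """Real Gaussian wave function, normalized in L^2 of the real line."""
    return (2 * math.pi * SIGMA ** 2) ** -0.25 * math.exp(-x * x / (4 * SIGMA ** 2))


def dpsi(x):
    """Derivative of psi, computed analytically for this Gaussian."""
    return -x / (2 * SIGMA ** 2) * psi(x)


def integrate(f, a=-12.0, b=12.0, n=20000):
    """Midpoint rule on [a, b]."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


mean_x = integrate(lambda x: x * psi(x) ** 2)
mean_x2 = integrate(lambda x: x * x * psi(x) ** 2)
delta_x = math.sqrt(mean_x2 - mean_x ** 2)

# For a real-valued psi that vanishes at infinity, <psi, P psi> = 0, so the
# variance of P is <psi, P^2 psi> = ||P psi||^2 = hbar^2 * integral of psi'^2.
mean_p2 = HBAR ** 2 * integrate(lambda x: dpsi(x) ** 2)
delta_p = math.sqrt(mean_p2)

product = delta_x * delta_p
print(delta_x, delta_p, product)
```

Here `delta_x` should be close to `SIGMA` and `delta_p` close to `HBAR / (2 * SIGMA)`, so the product is close to `HBAR / 2` regardless of the width chosen.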

3.7 Time-Evolution in Quantum Theory 

3.7.1 The Schrödinger Equation

Up to now, we have been considering the wave function $\psi$ at a fixed
time. We now consider the way in which the wave function evolves in time.
Recall that in the Hamiltonian formulation of classical mechanics (Sect.
2.5), the time-evolution of the system is governed by the Hamiltonian
(energy) function $H$, through Hamilton’s equations. According to Axiom 2,
there is a corresponding self-adjoint linear operator $\hat{H}$ on the
quantum Hilbert space $\mathbf{H}$, which we call the Hamiltonian operator
for the system. See Sect. 3.7.4 for an example.

Recall that we motivated the definition of the momentum operator by
the de Broglie hypothesis, $p = \hbar k$, where $k$ is the spatial frequency
of the wave function. We can motivate the time-evolution in quantum
mechanics by a similar relation between the energy and the temporal
frequency of our wave function:

$$E = \hbar\omega. \tag{3.25}$$

This relationship between energy and temporal frequency is nothing but the
relationship proposed by Planck in his model of blackbody radiation (Sect.
1.1.3). Suppose that a wave function $\psi_0$ has definite energy $E$,
meaning that $\psi_0$ is an eigenvector for $\hat{H}$ with eigenvalue $E$.
Then (3.25) means that the time-dependence of the wave function should be
purely at frequency $\omega = E/\hbar$. That is to say, if the state of the
system at time $t = 0$ is $\psi_0$, then the state of the system at any other
time $t$ should be

$$\psi(t) = e^{-i\omega t}\psi_0 = e^{-iEt/\hbar}\psi_0. \tag{3.26}$$

We can rewrite (3.26) as a differential equation:

$$\frac{d\psi}{dt} = -\frac{iE}{\hbar}\psi = \frac{E}{i\hbar}\psi. \tag{3.27}$$


Note that we are taking “temporal frequency $\omega$” to mean that the
time-dependence is of the form $e^{-i\omega t}$, whereas we took “spatial
frequency $k$” to mean that the space-dependence is of the form $e^{ikx}$,
with no minus sign in the exponent. This curious convention is convenient
when we look at pure exponential solutions to the free Schrödinger equation
(Chap. 4) of the form $\exp[i(kx - \omega t)]$, which describes a solution
moving to the right with speed $\omega/k$.

Equation (3.27) tells us the time-evolution for a particle that is initially
in a state of definite energy, that is, an eigenvector for the Hamiltonian
operator. A natural way to generalize this equation is to recognize that
$E\psi$ is nothing but $\hat{H}\psi$, since $\psi$ is just a multiple of
$\psi_0$, which is an eigenvector for $\hat{H}$ with eigenvalue $E$.
Replacing $E$ by $\hat{H}$ in (3.27) leads to the following general
prescription for the time-evolution of a quantum system.


Axiom 5 The time-evolution of the wave function $\psi$ in a quantum system
is given by the Schrödinger equation,

$$\frac{d\psi}{dt} = \frac{1}{i\hbar}\hat{H}\psi. \tag{3.28}$$

Here $\hat{H}$ is the operator corresponding to the classical Hamiltonian
$H$ by means of Axiom 2.

Although both Hamilton’s equations and the Schrodinger equation 
involve a Hamiltonian, the two equations otherwise do not seem parallel. 
Of course, since quantum mechanics is not classical mechanics, we should 
not expect the two theories to have the same time-evolution. Neverthe¬ 
less, we might hope to see some similarities between the time-evolution of 
a classical system and that of the corresponding quantum system. Such 
a similarity can be seen when we consider how the expectation values of 
observables evolve in quantum mechanics. 


Proposition 3.14 Suppose $\psi(t)$ is a solution of the Schrödinger equation
and $A$ is a self-adjoint operator on $\mathbf{H}$. Assuming certain natural
domain conditions hold, we have

$$\frac{d}{dt}\langle A\rangle_{\psi(t)} = \left\langle \frac{1}{i\hbar}[A, \hat{H}] \right\rangle_{\psi(t)}, \tag{3.29}$$

where $\langle A\rangle_\psi$ is as in Notation 3.10 and where
$[\cdot, \cdot]$ denotes the commutator, defined as

$$[A, B] = AB - BA.$$

Equation (3.29) should be compared to the way a function $f$ on the
classical phase space evolves in time along a solution of Hamilton’s
equations: $df/dt = \{f, H\}$. We see, then, that the commutator of operators
(divided by $i\hbar$) plays a role in quantum mechanics similar to the role
of the Poisson bracket in classical mechanics.

Proof. Let $\psi(t)$ be a solution to the Schrödinger equation and let us
compute at first without worrying about domains of the operators involved. If
we use the product rule (Exercise 1) for differentiation of the inner
product, we obtain

$$\begin{aligned}
\frac{d}{dt}\langle\psi, A\psi\rangle &= \left\langle \frac{d\psi}{dt}, A\psi \right\rangle + \left\langle \psi, A\frac{d\psi}{dt} \right\rangle \\
&= \frac{i}{\hbar}\langle\hat{H}\psi, A\psi\rangle - \frac{i}{\hbar}\langle\psi, A\hat{H}\psi\rangle \\
&= \frac{1}{i\hbar}\langle\psi, [A, \hat{H}]\psi\rangle,
\end{aligned}$$

where in the last step we have used the self-adjointness of $\hat{H}$ to
move it to the other side of the inner product. Recall that we are following
the convention of putting the complex conjugate on the first factor in the
inner product, which accounts for the plus sign in the first term on the
second line. Rewriting this using Notation 3.10 gives the desired result.

If $A$ and $\hat{H}$ are (as usual) unbounded operators, then the preceding
calculation is not completely rigorous. Since, however, we are deferring a
detailed examination of issues of unbounded operators until Chap. 9, let
us simply state the conditions needed for the calculation to be valid. For
every $t \in \mathbb{R}$, we need to have
$\psi(t) \in \mathrm{Dom}(A) \cap \mathrm{Dom}(\hat{H})$, we need
$A\psi(t) \in \mathrm{Dom}(\hat{H})$, and we need
$\hat{H}\psi(t) \in \mathrm{Dom}(A)$. (These conditions are needed for
$[A, \hat{H}]\psi(t)$ to be defined.) In addition, we need $A\psi(t)$ to be
a continuous path in $\mathbf{H}$. ■
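
In finite dimensions, where no domain issues arise, the statement of Proposition 3.14 can be checked directly. The sketch below uses a hypothetical two-level system with a diagonal Hamiltonian, so that the Schrödinger equation is solved explicitly by multiplying each component by $e^{-iEt/\hbar}$. The energies, the Hermitian matrix `A`, the initial state, and the choice $\hbar = 1$ are all assumptions made for illustration.

```python
import cmath

HBAR = 1.0                              # units with hbar = 1 (an assumption)
ENERGIES = [0.5, 2.0]                   # hypothetical eigenvalues of a diagonal H
A = [[1.0, 2 - 1j], [2 + 1j, -0.5]]     # hypothetical Hermitian observable
PSI0 = [0.6, 0.8j]                      # unit vector at time t = 0


def psi(t):
    """Solution of the Schrodinger equation for the diagonal Hamiltonian."""
    return [c * cmath.exp(-1j * e * t / HBAR) for c, e in zip(PSI0, ENERGIES)]


def matvec(m, v):
    """Apply the 2 x 2 matrix m to the vector v."""
    return [sum(m[i][j] * v[j] for j in range(2)) for i in range(2)]


def inner(u, v):
    """Inner product with the conjugate on the first factor."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))


def expect_a(t):
    """Expectation value <psi(t), A psi(t)>."""
    v = psi(t)
    return inner(v, matvec(A, v)).real


def commutator_side(t):
    """<psi(t), (1 / (i hbar)) [A, H] psi(t)> with H diagonal."""
    v = psi(t)
    ahv = matvec(A, [e * c for e, c in zip(ENERGIES, v)])    # A H psi
    hav = [e * c for e, c in zip(ENERGIES, matvec(A, v))]    # H A psi
    comm_v = [a - b for a, b in zip(ahv, hav)]
    return (inner(v, comm_v) / (1j * HBAR)).real


t, h = 0.3, 1e-5
derivative_side = (expect_a(t + h) - expect_a(t - h)) / (2 * h)
print(derivative_side, commutator_side(t))
```

The two printed numbers should agree up to the small error of the central difference.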

Note that to see interesting behavior in the time-evolution of a quantum
system, there has to be noncommutativity present. If all the physically
interesting operators $A$ commuted with the Hamiltonian operator $\hat{H}$,
then $[\hat{H}, A]$ would be zero and the expectation values of these
operators would be constant in time. Noncommutativity of the basic operators
is therefore an essential property of quantum mechanics. In the case of a
particle in $\mathbb{R}^1$, noncommutativity is built into the commutation
relation for $X$ and $P$, given in Proposition 3.8.

Although it is not reasonable to have all physically interesting operators
commute with $\hat{H}$, there may be some operators with this property. If
$[A, \hat{H}] = 0$, then the expectation value of $A$ (and, indeed, all the
moments of $A$) is independent of time along any solution of the Schrödinger
equation. We may therefore call such an operator $A$ a conserved quantity (or
constant of motion). Just as in the classical setting, conserved quantities
(when we can find them) are helpful in understanding how to solve the
Schrödinger equation.

Proposition 3.14 suggests that the map

\[ (A, B) \mapsto \frac{1}{i\hbar}[A, B], \]

where A and B are self-adjoint operators, plays a role similar to that of the 
Poisson bracket in classical mechanics. This analogy is supported by the 
following list of elementary properties of the commutator, which should be 
compared to the properties of the Poisson bracket listed in Proposition 2.23. 

Proposition 3.15 For any vector space V over $\mathbb{C}$ and linear operators A,
B, and C on V, the following relations hold.

1. $[A, B + \alpha C] = [A, B] + \alpha[A, C]$ for all $\alpha \in \mathbb{C}$

2. $[B, A] = -[A, B]$

3. $[A, BC] = [A, B]C + B[A, C]$

4. $[A, [B, C]] = [[A, B], C] + [B, [A, C]]$

Property 4 is equivalent to the Jacobi identity,

\[ [A,[B,C]] + [B,[C,A]] + [C,[A,B]] = 0, \tag{3.30} \]

as can easily be seen using the skew-symmetry of the commutator.

Proof. The first two properties of the commutator are obvious, and the 
third is easily verified by writing things out. Property 4 can also be proved 
by writing things out, but it is slightly messier. Each of the three double 
commutators on the left-hand side of (3.30) generates four terms, for a total 
of 12 terms. Each term has the operators A, B, and C multiplied together 
in some order. It is a straightforward but unenlightening calculation to 
verify that each of the six possible orderings of A, B, and C occurs twice, 
with opposite signs. ■ 
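
The "straightforward but unenlightening" bookkeeping can be delegated to a machine. The sketch below is only a spot check on one example, not a proof: the integer matrices A, B, C and the scalar ALPHA are arbitrary choices made for illustration, and integer entries are used so that every comparison is exact.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    """Scalar multiple of a matrix."""
    return [[c * x for x in row] for row in a]


def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    return add(matmul(a, b), scale(-1, matmul(b, a)))


# Arbitrary integer test data (an assumption made for illustration).
A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
C = [[2, -1], [1, 3]]
ALPHA = 3

prop1 = comm(A, add(B, scale(ALPHA, C))) == add(comm(A, B), scale(ALPHA, comm(A, C)))
prop2 = comm(B, A) == scale(-1, comm(A, B))
prop3 = comm(A, matmul(B, C)) == add(matmul(comm(A, B), C), matmul(B, comm(A, C)))
prop4 = comm(A, comm(B, C)) == add(comm(comm(A, B), C), comm(B, comm(A, C)))

# Left-hand side of the Jacobi identity (3.30); it should be the zero matrix.
jacobi = add(add(comm(A, comm(B, C)), comm(B, comm(C, A))), comm(C, comm(A, B)))

print(prop1, prop2, prop3, prop4, jacobi)
```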

If A and B are bounded self-adjoint operators on some Hilbert space,
then it is straightforward to check that $(1/(i\hbar))[A, B]$ is again self-adjoint
(Exercise 3). If A and B are unbounded self-adjoint operators, then the
operator $(1/(i\hbar))[A, B]$ will be self-adjoint under suitable assumptions on
the domains of A and B.


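Proposition 3.16 If $\phi(t)$ and $\psi(t)$ are solutions to the Schrödinger
equation (3.28), the quantity $\langle \phi(t), \psi(t)\rangle$ is independent of t. In
particular, $\|\psi(t)\|$ is independent of t, for any solution $\psi(t)$ of the
Schrödinger equation.

Proof. Using again the product rule, we have

\[
\begin{aligned}
\frac{d}{dt}\langle \phi(t), \psi(t)\rangle
&= \left\langle \frac{1}{i\hbar}H\phi(t), \psi(t) \right\rangle + \left\langle \phi(t), \frac{1}{i\hbar}H\psi(t) \right\rangle \\
&= \frac{i}{\hbar}\langle H\phi(t), \psi(t)\rangle - \frac{i}{\hbar}\langle \phi(t), H\psi(t)\rangle.
\end{aligned}
\]

Since H is self-adjoint, we can move H to the other side of the inner product
and the derivative is equal to 0. ■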

3.7.2 Solving the Schrödinger Equation by Exponentiation

The Schrödinger equation is an example of an equation of the form

\[ \frac{dv}{dt} = Av, \tag{3.31} \]

where A is a linear operator on a Hilbert space. (In the Schrödinger case,
we have $A = -(i/\hbar)H$.) Let us think of (3.31) in the case where the Hilbert
space is the finite-dimensional space $\mathbb{C}^n$. In that case, we can think of A as
an n x n matrix, in which case (3.31) is the sort of equation encountered
in the elementary theory of ordinary differential equations. The solution of
this system (in the finite-dimensional case) can be expressed as

\[ v(t) = e^{tA}v_0, \]

where the matrix exponential $e^{tA}$ is defined by a convergent power series
and where $v_0 = v(0)$ is the initial condition. If A is diagonalizable, then
the exponential can be computed by using a basis of eigenvectors. (See
Sect. 16.4 for more information.)

The Schrödinger equation simply replaces $\mathbb{C}^n$ by a Hilbert space $\mathbf{H}$ and
the matrix A by the linear operator $-(i/\hbar)H$.
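
To make the finite-dimensional statement concrete, here is a minimal sketch that sums the power series for the matrix exponential and applies it to an initial condition. The matrix A, the vector V0, and the number of terms are illustrative assumptions; for this A one has $A^2 = -I$, so the exact solution is v(t) = (cos t, -sin t), which gives something to compare against.

```python
import math


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matvec(a, v):
    """Apply the matrix a to the vector v."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]


def expm_series(a, t, terms=30):
    """Partial sum of the series sum_k (t a)^k / k! for a square matrix a."""
    n = len(a)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    result = [row[:] for row in identity]
    term = identity
    for k in range(1, terms):
        # Turn (t a)^(k-1) / (k-1)! into (t a)^k / k!.
        term = [[t * x / k for x in row] for row in matmul(term, a)]
        result = [[r + x for r, x in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result


# Illustrative data (an assumption): a generator of rotations of the plane.
A = [[0.0, 1.0], [-1.0, 0.0]]
V0 = [1.0, 0.0]


def solve(t):
    """v(t) = exp(tA) v0, the solution of dv/dt = Av with v(0) = v0."""
    return matvec(expm_series(A, t), V0)


print(solve(1.0), (math.cos(1.0), -math.sin(1.0)))
```

The truncation at 30 terms is adequate only because t times the norm of A is small here; it is not a general-purpose algorithm.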

Claim 3.17 Suppose H is a self-adjoint operator on $\mathbf{H}$. If a reasonable
meaning can be given to the expression $e^{-itH/\hbar}$, then the Schrödinger
equation can be solved by setting

\[ \psi(t) = e^{-itH/\hbar}\psi_0. \tag{3.32} \]

To see why the claim should be true, we expect that we can differentiate
the operator-valued expression $e^{-itH/\hbar}$ with respect to t as we would in the
finite-dimensional case. The differentiation, then, would pull down a factor
of $-iH/\hbar$, which would indicate that $\psi(t)$ indeed solves the Schrödinger
equation. Furthermore, when t = 0, $e^{-itH/\hbar}$ should be equal to I, so that
$\psi(0)$ is indeed $\psi_0$.

If H is a bounded operator (which is rarely the case), then the
exponential $e^{-itH/\hbar}$ can be defined by a convergent power series, precisely as
in the finite-dimensional case. In that case, Claim 3.17 is an easily proved
theorem.

In the more typical case where H is unbounded, convergence of the series
for the exponential is a rather delicate matter, and it is better instead to
use the spectral theorem. We leave a general discussion of the spectral
theorem to Chaps. 7 and 10, and here consider only the case of a pure
point spectrum. A (possibly unbounded) self-adjoint operator H is said to
have a pure point spectrum if there exists an orthonormal basis $\{e_j\}$ for $\mathbf{H}$
consisting of eigenvectors for H. If $He_j = E_j e_j$ for some $E_j \in \mathbb{R}$, then the
exponential can be defined by requiring that

\[ e^{-itH/\hbar}e_j = e^{-itE_j/\hbar}e_j. \tag{3.33} \]

The operator $e^{-itH/\hbar}$ is unitary and thus bounded; it is the unique bounded
operator on $\mathbf{H}$ satisfying (3.33).
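
The recipe in (3.33) can be sketched in a two-dimensional toy model: expand the initial vector in the eigenbasis, multiply each coefficient by the corresponding phase, and sum. The Hamiltonian assumed here, the matrix with rows (0, 1) and (1, 0), its hand-computed eigenpairs, and the units with hbar = 1 are illustrative assumptions and do not come from the text.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = 1

# Orthonormal eigenvectors e_j and eigenvalues E_j of the toy Hamiltonian
# H = [[0, 1], [1, 0]] (an assumption chosen because they are known by hand).
S = 1 / math.sqrt(2)
EIGENPAIRS = [(1.0, [S, S]), (-1.0, [S, -S])]


def inner(u, v):
    """Inner product, conjugate-linear in the first factor as in the text."""
    return sum(x.conjugate() * y for x, y in zip(u, v))


def evolve(psi0, t):
    """Apply exp(-itH/hbar) to psi0 using (3.33) on each eigenvector."""
    out = [0j] * len(psi0)
    for energy, e in EIGENPAIRS:
        # Coefficient of e_j in psi0, times the phase exp(-i t E_j / hbar).
        coeff = cmath.exp(-1j * t * energy / HBAR) * inner(e, psi0)
        out = [o + coeff * x for o, x in zip(out, e)]
    return out


psi_t = evolve([1 + 0j, 0j], 0.4)
norm_t = math.sqrt(inner(psi_t, psi_t).real)
print(psi_t, norm_t)
```

For this example the result should agree with the closed form (cos t, -i sin t), and the norm stays equal to 1, as unitarity requires.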

It is not precisely true that every self-adjoint operator has an orthonormal
basis of eigenvectors, even if the operator is bounded. Nevertheless,
given a self-adjoint operator A, the spectral theorem tells us that there is a
decomposition of $\mathbf{H}$ into “generalized eigenspaces” for A. It is, however, a
bit complicated to state the precise sense of this decomposition, especially
in the case of unbounded operators. Still, Claim 3.17 allows us to identify
one goal for the spectral theorem: Whatever the spectral theorem says, it
ought to allow us to make sense of the expression $e^{iaA}$, for any self-adjoint
operator A and real number a. This goal will indeed be realized, in the
bounded case in Chap. 7 and in the unbounded case in Chap. 10.

We should add two points of clarification regarding the expression (3.32).
First, in writing (3.32), we have not “really” solved the Schrödinger
equation. For this expression to be useful, we need to compute $e^{-itH/\hbar}$ in some
relatively explicit way. If, for example, we can actually compute an
orthonormal basis of eigenvectors for H, then in light of (3.33), we are on
our way to understanding the behavior of the operator $e^{-itH/\hbar}$. Second,
although H is an unbounded operator, which is not defined on all of $\mathbf{H}$
but only on a dense subspace, the operator $e^{-itH/\hbar}$ is unitary and
defined on all of $\mathbf{H}$. Thus, the right-hand side of (3.32) makes sense for any
$\psi_0$ in $\mathbf{H}$. Nevertheless, we cannot expect that $\psi(t)$ actually solves the
Schrödinger equation (in the natural Hilbert space sense) unless $\psi_0$ belongs
to the domain of H. (See Lemma 10.17 in Sect. 10.2.)


3.7.3 Eigenvectors and the Time-Independent Schrödinger
Equation

As we saw in the preceding section, eigenvectors for the Hamiltonian
operator are of great importance in solving the Schrödinger equation. In light
of this fact, we make the following definition.

Definition 3.18 If H is the Hamiltonian operator for a quantum system,
the eigenvector equation

\[ H\psi = E\psi, \quad E \in \mathbb{R}, \tag{3.34} \]

is called the time-independent Schrödinger equation.

As always in eigenvector equations, we are trying to determine both the
numbers E for which (3.34) has a nonzero solution (the eigenvalues) and the
corresponding vectors $\psi$ (the eigenvectors). When quantum texts speak of
“solving,” say, the quantum harmonic oscillator, what they usually mean is
finding all of the solutions to the time-independent Schrödinger equation.
(See, e.g., Chaps. 5 and 11.) If $\psi$ is a solution to the time-independent
Schrödinger equation, then the solution to the time-dependent Schrödinger
equation with initial condition $\psi$ is simply $\psi(t) = e^{-itE/\hbar}\psi$. Since $\psi(t)$ is
just a constant multiple of $\psi$, we see that $\psi(t)$ represents the same physical
state as $\psi$. Thus, a solution to the time-independent Schrödinger equation
is sometimes called a stationary state.

3.7.4 The Schrödinger Equation in $\mathbb{R}^1$

Let us now consider the simplest example for the Hamiltonian operator
H. For a particle moving in $\mathbb{R}^1$, recall (Sect. 3.5) that we have identified
the position operator X as being multiplication by x and the momentum
operator as $P = -i\hbar\, d/dx$. The classical Hamiltonian for such a particle
is typically taken to be of the form $H(x,p) = p^2/(2m) + V(x)$, where V is
the potential energy function. In that case, we may reasonably take

\[ H = \frac{P^2}{2m} + V(X). \]

Here the operator V(X) is simply multiplication by the potential energy
function V(x). (This operator may also be thought of as the function V
applied to the operator X in the sense of the functional calculus coming
from the spectral theorem.) We see, then, that

\[ H\psi(x) = -\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + V(x)\psi(x). \tag{3.35} \]

An operator of the form (3.35), or an analogously defined operator in higher
dimensions, is referred to as a Schrödinger operator. (The term Hamiltonian
operator refers more generally to whatever operator governs the
time-evolution of a quantum system, regardless of its form.)

If our Hamiltonian is of the form given in (3.35), then the time-dependent
Schrödinger equation takes the form

\[ \frac{\partial \psi(x,t)}{\partial t} = \frac{i\hbar}{2m}\frac{\partial^2 \psi(x,t)}{\partial x^2} - \frac{i}{\hbar}V(x)\psi(x,t), \tag{3.36} \]

which is a linear partial differential equation. By contrast, Newton’s
equation for a particle in $\mathbb{R}^1$ is a typically nonlinear ordinary differential
equation.

For a particle in $\mathbb{R}^1$, the time-independent Schrödinger equation is an
ordinary differential equation, one that is linear but that has nonconstant
coefficients, unless V happens to be constant. For simple examples of the
potential function V, there are relatively standard methods of ordinary
differential equations that can be brought to bear on the time-independent
Schrödinger equation.


3.7.5 Time-Evolution of the Expected Position 
and Expected Momentum 

Since a quantum particle does not have a fixed position or momentum, it
does not make sense to ask whether the particle satisfies Newton’s equation.
It does, however, make sense to ask whether the expected values of the
position and momentum satisfy Newton’s equation (in the form of Hamilton’s
equations).


Proposition 3.19 Suppose $\psi(t)$ is a solution to the Schrödinger
equation (3.36) for a sufficiently nice potential V and for a sufficiently nice
initial condition $\psi(0) = \psi_0$. Then the expected position and expected
momentum in the state $\psi(t)$ satisfy

\[ \frac{d}{dt}\langle X\rangle_{\psi(t)} = \frac{1}{m}\langle P\rangle_{\psi(t)} \tag{3.37} \]

\[ \frac{d}{dt}\langle P\rangle_{\psi(t)} = -\langle V'(X)\rangle_{\psi(t)}. \tag{3.38} \]


The assumptions in the proposition are there for two reasons: First, to
ensure that H is actually a self-adjoint operator (see Sect. 9.9) and second, to
ensure that the domain assumptions in Proposition 3.14 are satisfied. If we
assume, for example, that V(x) is a bounded-below polynomial in x and
that $\psi_0$ belongs to the Schwartz space (A.15), then both of these concerns
will be taken care of. Once these technicalities are addressed, the proof of
Proposition 3.19 is a straightforward application of Proposition 3.14; see
Exercise 4. Note that (3.37) says that in a certain sense, the velocity of a
quantum particle is 1/m times the momentum, just as in the classical case.

At first glance, it might appear that the pair $(\langle X\rangle_{\psi(t)}, \langle P\rangle_{\psi(t)})$ is a
solution to Hamilton’s equations, and indeed (3.37) is precisely what Hamilton’s
equations require. To get a solution to Hamilton’s equations, however, we
would need the right-hand side of (3.38) to equal $-V'(\langle X\rangle_{\psi(t)})$. But in
general,

\[ \langle V'(X)\rangle_\psi \neq V'(\langle X\rangle_\psi). \]

Consider, for example, the case $V'(x) = x^3 + x^2$. If $\psi$ is an even function,
then $\langle X\rangle_\psi = 0$ and so $V'(\langle X\rangle_\psi) = 0$. But $\langle X^3 + X^2\rangle_\psi$ will not be
zero, because the $X^3$ term will be zero and the $X^2$ term will be positive.
We conclude, then, that $\langle X\rangle_{\psi(t)}$ and $\langle P\rangle_{\psi(t)}$ usually do not evolve along
solutions to Hamilton’s equations.
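
The example with $V'(x) = x^3 + x^2$ can be made numerical. The sketch below assumes a specific even wave function, $\psi(x) = e^{-x^2/2}$ up to normalization, and approximates the expectation values by Riemann sums on a symmetric grid; the grid spacing and cutoff are illustrative choices, not part of the text.

```python
import math

DX = 0.01
XS = [(k - 800) * DX for k in range(1601)]  # symmetric grid on [-8, 8]


def density(x):
    """Unnormalized |psi(x)|^2 for the even wave function psi(x) = exp(-x^2/2)."""
    return math.exp(-x * x)


NORM = sum(density(x) for x in XS) * DX


def expect(f):
    """Riemann-sum approximation of the expectation value <f(X)>_psi."""
    return sum(f(x) * density(x) for x in XS) * DX / NORM


def v_prime(x):
    """The derivative of the potential used in the text's example."""
    return x ** 3 + x ** 2


mean_x = expect(lambda x: x)       # <X>, zero by symmetry
lhs = expect(v_prime)              # <V'(X)>
rhs = v_prime(mean_x)              # V'(<X>)

print(mean_x, lhs, rhs)
```

For this Gaussian the expectation of $X^2$ is 1/2, so the first quantity is about 0.5 while the second is essentially zero.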

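There is, however, one case in which $\langle V'(X)\rangle_\psi$ coincides with $V'(\langle X\rangle_\psi)$,
and that is the case in which V is quadratic, in which case V' is linear. In
that case we have

\[ \langle V'(X)\rangle_\psi = \langle aX + bI\rangle_\psi = a\langle X\rangle_\psi + b = V'(\langle X\rangle_\psi). \]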

Thus, the expected position and expected momentum do follow classical 
trajectories in the case of a quadratic potential. It is not surprising that 
this case is special in quantum mechanics, since it is also special in classical 
mechanics; this is the case in which Newton’s law is a linear differential 
equation. 

Although the expected position and expected momentum do not (in
general) exactly follow classical trajectories, they will do so approximately
under certain conditions. If the wave function $\psi(x)$ is concentrated mostly
near a single point $x = x_0$, then $\langle V'(X)\rangle_\psi$ and $V'(\langle X\rangle_\psi)$ will both be
approximately equal to $V'(x_0)$. In that case, the expected position and
expected momentum of the particle will approximately follow a classical
trajectory, at least for as long as the wave function remains concentrated
near a single point.


3.8 The Heisenberg Picture 

The “Heisenberg picture” of quantum mechanics is based on Heisenberg’s 
matrix model of quantum mechanics (Sect. 1.3). In the Heisenberg picture, 
one thinks of the operators (quantum observables) as evolving in time, while 
the vectors in the Hilbert space (quantum states) remain independent of 
time. This is to be contrasted with the approach to quantum mechanics 
we have been using up to now (the “Schrödinger picture”), in which the
observables are independent of time and the states evolve in time. 

Definition 3.20 In the Heisenberg picture, each self-adjoint operator A 
evolves in time according to the operator-valued differential equation 

\[ \frac{dA(t)}{dt} = \frac{1}{i\hbar}[A(t), H], \tag{3.39} \]

where H is the Hamiltonian operator of the system, and where [•,•] is the 
commutator, given by [A, B] = AB — BA. 

Note that since H commutes with itself, the operator H remains constant 
in time, even in the Heisenberg picture. This observation is the quantum 
counterpart to the fact that the classical Hamiltonian H remains constant 
along a solution of Hamilton’s equations.

Given the self-adjoint operator H, the spectral theorem will give us a way
to construct a family of unitary operators $e^{-itH/\hbar}$, $t \in \mathbb{R}$, and this family of
operators computes the time-evolution of states in the Schrödinger picture
(Sect. 3.7.2). It is easy to check (at least formally) that the solution to
(3.39) can be expressed as

\[ A(t) = e^{itH/\hbar}Ae^{-itH/\hbar}. \tag{3.40} \]

Now, if $\psi$ is the state of the system (now considered to be independent of
time), then the expectation of A(t) in the state $\psi$ is defined to be
$\langle A(t)\rangle_\psi = \langle \psi, A(t)\psi\rangle$. We may then compute that

\[
\begin{aligned}
\langle A(t)\rangle_\psi &= \langle \psi, e^{itH/\hbar}Ae^{-itH/\hbar}\psi\rangle \\
&= \langle e^{-itH/\hbar}\psi, Ae^{-itH/\hbar}\psi\rangle \\
&= \langle \psi(t), A\psi(t)\rangle,
\end{aligned}
\]

where $\psi(t)$ is the time-evolved state of the system in the Schrödinger picture.
Here, we have used that the adjoint of $e^{itH/\hbar}$ is $e^{-itH/\hbar}$, which is formally
clear and which is a consequence of the spectral theorem.

Note that in the Schrödinger picture, $\langle \psi(t), A\psi(t)\rangle$ is the expectation
value of A in the state $\psi(t)$. We conclude, then, that the Heisenberg picture
and the Schrödinger picture give rise to precisely the same expectation
values for observables as a function of time, and are therefore physically
equivalent. Although we will work primarily with the Schrödinger picture of
quantum mechanics, the Heisenberg picture is also important, for example,
in quantum field theory.
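
The equivalence of the two pictures is easy to test in a finite-dimensional toy model. In the sketch below, the Hamiltonian (the 2 x 2 matrix with rows (0, 1) and (1, 0)), the observable A, the unit vector PSI, and the units with hbar = 1 are all assumptions chosen for illustration. The observable is evolved by (3.40) in one computation and the state is evolved in the other.

```python
import math

HBAR = 1.0  # assumption: units in which hbar = 1
A = [[1, 0], [0, -1]]      # toy observable
PSI = [0.6 + 0j, 0.8j]     # toy unit vector


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matvec(a, v):
    """Apply the matrix a to the vector v."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]


def inner(u, v):
    """Inner product, conjugate-linear in the first factor as in the text."""
    return sum(x.conjugate() * y for x, y in zip(u, v))


def dagger(a):
    """Conjugate transpose (adjoint) of a square matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def propagator(t):
    """exp(-itH/hbar) for the toy Hamiltonian H = [[0, 1], [1, 0]]."""
    c, s = math.cos(t / HBAR), math.sin(t / HBAR)
    return [[c + 0j, -1j * s], [-1j * s, c + 0j]]


def heisenberg_observable(t):
    """A(t) = exp(itH/hbar) A exp(-itH/hbar), as in (3.40)."""
    u = propagator(t)
    return matmul(matmul(dagger(u), A), u)


def schrodinger_state(t):
    """psi(t) = exp(-itH/hbar) psi."""
    return matvec(propagator(t), PSI)


T = 0.9
heis = inner(PSI, matvec(heisenberg_observable(T), PSI)).real
state = schrodinger_state(T)
schr = inner(state, matvec(A, state)).real
print(heis, schr)
```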

Proposition 3.21 Suppose $H = P^2/(2m) + V(X)$, where V is a
bounded-below polynomial. Then for any $t \in \mathbb{R}$ we have

\[ H = \frac{1}{2m}(P(t))^2 + V(X(t)). \tag{3.41} \]

Note that since $[H,H] = 0$, the Hamiltonian H is independent of time,
even in the Heisenberg picture. Thus, the right-hand side of (3.41) is
actually independent of t, even though P(t) and X(t) depend on t.
Equation (3.41) holds also for sufficiently nice nonpolynomial functions V, but
some limiting argument would be required in the proof. The assumption
that V be bounded below is to ensure that H is actually an (essentially)
self-adjoint operator; compare Sect. 9.10.

Lemma 3.22 Suppose A is a self-adjoint operator on $\mathbf{H}$ and that $A(\cdot)$ is
a solution to (3.39) with A(0) = A. Then for any positive integer m, the
map

\[ t \mapsto (A(t))^m \]

is also a solution to (3.39).

That is to say, the time-evolution of the mth power of A is the same as
the mth power of the time-evolution of A; that is, $A^m(t) = (A(t))^m$.

Proof. If we use (3.40), then the result holds because

\[
\begin{aligned}
e^{itH/\hbar}A^m e^{-itH/\hbar}
&= e^{itH/\hbar}Ae^{-itH/\hbar}\, e^{itH/\hbar}Ae^{-itH/\hbar} \cdots e^{itH/\hbar}Ae^{-itH/\hbar} \\
&= \left(e^{itH/\hbar}Ae^{-itH/\hbar}\right)^m.
\end{aligned}
\]

It is also easy to check that $A(t)^m$ satisfies the differential equation (3.39). ■


With this lemma in hand, it is easy to prove the proposition. 

Proof of Proposition 3.21. On the one hand, since $[H,H] = 0$, the
time-evolved operator H(t) is simply equal to H. On the other hand, if we
time-evolve $P^2/(2m) + V(X)$ using Lemma 3.22, we obtain the expression
on the right-hand side of (3.41). ■

Proposition 3.23 Suppose the Hamiltonian of a quantum system is as
in Proposition 3.21. Then the operators X(t) and P(t) defined by (3.39)
satisfy the following operator-valued differential equations:

\[
\begin{aligned}
\frac{dX}{dt} &= \frac{1}{m}P(t) \\
\frac{dP}{dt} &= -V'(X(t)).
\end{aligned}
\tag{3.42}
\]


Proof. See Exercise 7. ■ 

Proposition 3.23 means that the operator-valued functions X(t) and P(t)
satisfy the operator analogs of the classical equations of motion dx/dt =
p(t)/m and dp/dt = −V'(x(t)). Nevertheless, the expectation values of X(t)
and P(t) do not satisfy the ordinary equations of motion, as we have already
seen by calculating in the Schrödinger picture. If we take expectation values
in the system (3.42), we get the same answer as in Proposition 3.19, namely,

\[
\begin{aligned}
\frac{d}{dt}\langle X(t)\rangle_\psi &= \frac{1}{m}\langle P(t)\rangle_\psi \\
\frac{d}{dt}\langle P(t)\rangle_\psi &= -\langle V'(X(t))\rangle_\psi.
\end{aligned}
\]

These are not the classical equations of motion, unless the expectation value
of the operator V'(X(t)) coincides with V' applied to the expectation value
of X(t), which is usually not the case.


3.9 Example: A Particle in a Box 

Let us consider quantum mechanics in one space dimension for a particle
that is confined to move in a “box,” which we describe as the interval
$0 \le x \le L$. Our goal is to find all of the eigenvectors and eigenvalues of
the Schrödinger operator, that is, to find solutions of the time-independent
Schrödinger equation $H\psi = E\psi$. In solving this equation, we may think of
the constraint to the box as follows. Imagine a particle moving in $\mathbb{R}^1$ in the
presence of a potential V that is 0 for x between 0 and L and takes some
very large constant value C on the rest of the real line. Classically, this
would mean that the particle has to have very high energy (greater than
C) to escape from the box. Quantum mechanically, if we have a solution
of the time-independent Schrödinger equation $H\psi = E\psi$ for this potential
(with $E \ll C$), then we expect $\psi$ to decay rapidly for x outside of the box.
(We will see this behavior explicitly in Chap. 5.) In the limit as C tends to
infinity, we expect solutions of the time-independent Schrödinger equation
to be zero outside the box and to tend to zero as we approach the ends of
the box.

The upshot of this discussion is that we are looking for smooth functions
$\psi$ on [0, L] that satisfy the differential equation

\[ -\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} = E\psi(x), \quad 0 < x < L \tag{3.43} \]

and the boundary conditions

\[ \psi(0) = \psi(L) = 0. \tag{3.44} \]


For E > 0, the solution space to (3.43) will be the span of two complex
exponentials, or equivalently a sine and a cosine function:

\[ \psi(x) = a\sin\left(\frac{\sqrt{2mE}}{\hbar}x\right) + b\cos\left(\frac{\sqrt{2mE}}{\hbar}x\right). \tag{3.45} \]


If we now impose the boundary condition $\psi(0) = 0$, we get that b = 0,
leaving only the sine term. If we then impose the condition $\psi(L) = 0$, we
will obtain a = 0, which would mean that $\psi$ is identically zero, unless

\[ \sin\left(\frac{\sqrt{2mE}}{\hbar}L\right) = 0. \tag{3.46} \]

Since we are interested in solutions to (3.43) where $\psi$ is not identically
zero, we want (3.46) to hold. Thus, the argument of the sine function must be
an integer multiple of $\pi$. This condition imposes a restriction on the value
of E, namely that E should be of the form

\[ E_j := \frac{j^2\pi^2\hbar^2}{2mL^2} \tag{3.47} \]

for some positive integer j.

It is a simple exercise (Exercise 8) to verify that for $E \le 0$, the only
solution to (3.43) satisfying the boundary conditions (3.44) is the one with
$\psi$ identically zero.

Proposition 3.24 The following functions are solutions to (3.43)
satisfying the boundary conditions (3.44):

\[ \psi_j(x) = \sqrt{\frac{2}{L}}\sin\left(\frac{j\pi x}{L}\right), \quad j = 1, 2, 3, \ldots, \]

and the corresponding eigenvalues $E_j$ are given by (3.47). The functions
$\psi_j$ form an orthonormal basis for the Hilbert space $L^2([0, L])$.


Proof. We have already verified the equation and eigenvalue for each $\psi_j$.
It is a simple computation to verify that the $\psi_j$’s are orthonormal, and the
elementary theory of Fourier series (Fourier sine series, in this case) shows
that the $\psi_j$’s form an orthonormal basis for $L^2([0, L])$. ■
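
The "simple computation" can also be spot-checked numerically. The sketch below uses the sine eigenfunctions and the energies (3.47), with the illustrative assumption hbar = m = L = 1. It approximates the inner products by a midpoint rule and tests the eigenvalue equation (3.43) at one interior point with a finite-difference second derivative; the step sizes are ad hoc choices.

```python
import math

HBAR = M = L = 1.0  # assumption: convenient units, not fixed by the text


def psi(j, x):
    """Normalized box eigenfunction sqrt(2/L) sin(j pi x / L)."""
    return math.sqrt(2 / L) * math.sin(j * math.pi * x / L)


def energy(j):
    """Eigenvalue E_j = j^2 pi^2 hbar^2 / (2 m L^2) from (3.47)."""
    return (j * math.pi * HBAR) ** 2 / (2 * M * L ** 2)


def overlap(j, k, n=2000):
    """Midpoint-rule approximation of the L^2([0, L]) inner product."""
    dx = L / n
    return sum(psi(j, (i + 0.5) * dx) * psi(k, (i + 0.5) * dx)
               for i in range(n)) * dx


def residual(j, x, h=1e-4):
    """-(hbar^2 / 2m) psi'' - E_j psi at x, with psi'' by central differences."""
    second = (psi(j, x + h) - 2 * psi(j, x) + psi(j, x - h)) / h ** 2
    return -HBAR ** 2 / (2 * M) * second - energy(j) * psi(j, x)


print(overlap(1, 1), overlap(1, 2), residual(2, 0.3))
```

The first number should be close to 1 and the other two close to 0. This checks orthonormality and the eigenvalue equation only; completeness of the basis is the part that genuinely needs Fourier theory.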

The Hamiltonian operator for this problem (in which V = 0 inside the
box) is given by

\[ H = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2}. \]

This operator is an unbounded operator and is not defined on the whole
Hilbert space $L^2([0, L])$, but only on a dense subspace $\mathrm{Dom}(H) \subset L^2([0, L])$.
The domain of H should be chosen in such a way that H is essentially
self-adjoint and, thus, symmetric (Sect. 3.2), meaning that

\[ \langle \phi, H\psi\rangle = \langle H\phi, \psi\rangle \tag{3.48} \]

for all $\phi$, $\psi$ in $\mathrm{Dom}(H)$. For (3.48) to hold, $\phi$ and $\psi$ must satisfy
appropriate boundary conditions, which will allow the boundary terms in the
integration by parts to be zero. (See Exercise 9.)

Mathematically, then, it is necessary to impose some boundary
conditions in order for H to be an essentially self-adjoint operator. The particular
choice of boundary conditions (3.44) is based on the idea of approximating
the box by a very large “confining” potential outside the box. See Chap. 9
for an extensive discussion of domain issues for unbounded operators.


3.10 Quantum Mechanics for a Particle in $\mathbb{R}^n$

Up to this point, we have been considering a quantum particle moving
in $\mathbb{R}^1$. It is straightforward, however, to generalize to a quantum particle
moving in $\mathbb{R}^n$. The Hilbert space for a particle in $\mathbb{R}^n$ is $L^2(\mathbb{R}^n)$, rather than
$L^2(\mathbb{R})$. Instead of a single position operator, we have n such operators, given
by

\[ X_j\psi(\mathbf{x}) = x_j\psi(\mathbf{x}), \quad j = 1, \ldots, n. \]

Similarly, we have n momentum operators, given by

\[ P_j\psi(\mathbf{x}) = -i\hbar\frac{\partial \psi}{\partial x_j}, \quad j = 1, \ldots, n. \]

As in the $\mathbb{R}^1$ case, $X_j$ does not commute with $P_j$ but satisfies $[X_j, P_j] = i\hbar I$.
On the other hand, $X_j$ commutes with $X_k$ and $P_j$ commutes with $P_k$.
Furthermore, $X_j$ commutes with $P_k$ for $j \neq k$. These formulas are referred
to as the canonical commutation relations.


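Proposition 3.25 (Canonical Commutation Relations) The position
and momentum operators satisfy

\[
\begin{aligned}
\frac{1}{i\hbar}[X_j, X_k] &= 0 \\
\frac{1}{i\hbar}[P_j, P_k] &= 0 \\
\frac{1}{i\hbar}[X_j, P_k] &= \delta_{jk} I
\end{aligned}
\tag{3.49}
\]

for all $1 \le j, k \le n$.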


These relations are the quantum counterparts of the Poisson bracket
relations among the position and momentum functions in classical mechanics.
Specifically, the role of the Poisson bracket in Proposition 2.24 is played in
Proposition 3.25 by the quantity $(1/(i\hbar))[\cdot, \cdot]$.
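
The canonical commutation relations can be verified exactly on polynomials, where multiplication and differentiation are purely algebraic. The sketch below takes n = 2 and represents a polynomial as a dictionary from exponent tuples to coefficients; this representation, the test polynomial F, the zero-based indices, and the units with hbar = 1 are all assumptions made for illustration. Polynomials are not in $L^2(\mathbb{R}^n)$, so this checks only the formal algebra, not any domain questions.

```python
HBAR = 1.0  # assumption: units in which hbar = 1


def position(j):
    """Return X_j (zero-based j) acting on {exponent tuple: coefficient} dicts."""
    def apply(poly):
        out = {}
        for exps, c in poly.items():
            key = exps[:j] + (exps[j] + 1,) + exps[j + 1:]
            out[key] = out.get(key, 0) + c
        return out
    return apply


def momentum(j):
    """Return P_j = -i hbar d/dx_j (zero-based j) on the same representation."""
    def apply(poly):
        out = {}
        for exps, c in poly.items():
            if exps[j] > 0:
                key = exps[:j] + (exps[j] - 1,) + exps[j + 1:]
                out[key] = out.get(key, 0) + (-1j * HBAR) * exps[j] * c
        return out
    return apply


def commutator(op_a, op_b, poly):
    """[op_a, op_b] applied to poly, with zero coefficients dropped."""
    ab, ba = op_a(op_b(poly)), op_b(op_a(poly))
    diff = {k: ab.get(k, 0) - ba.get(k, 0) for k in set(ab) | set(ba)}
    return {k: c for k, c in diff.items() if c != 0}


def scaled(c, poly):
    """Scalar multiple of a polynomial."""
    return {k: c * v for k, v in poly.items()}


# Test polynomial 2 + 3 x1^2 x2 - x2^3 (an arbitrary choice).
F = {(0, 0): 2, (2, 1): 3, (0, 3): -1}

x1_p1 = commutator(position(0), momentum(0), F)   # expected: i hbar F
x1_p2 = commutator(position(0), momentum(1), F)   # expected: zero polynomial
x1_x2 = commutator(position(0), position(1), F)   # expected: zero polynomial
p1_p2 = commutator(momentum(0), momentum(1), F)   # expected: zero polynomial

print(x1_p1 == scaled(1j * HBAR, F), x1_p2, x1_x2, p1_p2)
```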

If the classical Hamiltonian for a particle in $\mathbb{R}^n$ is of the usual form
(kinetic energy plus potential energy), then we may analogously define the
Hamiltonian operator to be of the form

\[ H = \sum_{j=1}^n \frac{P_j^2}{2m} + V(\mathbf{X}), \tag{3.50} \]

where $V(\mathbf{X})$ denotes the result of applying the function V to the commuting
family of operators $\mathbf{X} = (X_1, \ldots, X_n)$. It is natural to identify $V(\mathbf{X})$ with
the operator of multiplication by the function $V(\mathbf{x})$. In that case, we may
write H more explicitly as

\[ H\psi(\mathbf{x}) = -\frac{\hbar^2}{2m}\Delta\psi(\mathbf{x}) + V(\mathbf{x})\psi(\mathbf{x}), \]

where $\Delta$ is the Laplacian, given by

\[ \Delta = \sum_{j=1}^n \frac{\partial^2}{\partial x_j^2}. \]

We refer to an operator of the form (3.50) as a Schrödinger operator.

We may also introduce angular momentum operators defined by analogy 
to the classical angular momentum functions. 


Definition 3.26 For each pair (j, k) with $1 \le j, k \le n$, define the angular
momentum operator $J_{jk}$ by the formula

\[ J_{jk} = X_jP_k - X_kP_j. \]

As in the classical case, we have $J_{jk} = 0$ when j = k. When $j \neq k$, $X_j$
and $P_k$ commute, so the order of the factors in the definition of $J_{jk}$ is not
important. Explicitly, we have

\[ J_{jk} = -i\hbar\left(x_j\frac{\partial}{\partial x_k} - x_k\frac{\partial}{\partial x_j}\right). \]

The operator in parentheses is the angular derivative $(\partial/\partial\theta)$ in the $(x_j, x_k)$
plane.

When n = 3, it is customary to use the quantum counterpart of the
classical angular momentum vector, namely,

\[ J_1 := X_2P_3 - X_3P_2; \quad J_2 := X_3P_1 - X_1P_3; \quad J_3 := X_1P_2 - X_2P_1. \tag{3.51} \]

When n = 3, every $J_{jk}$ with $j \neq k$ is one of the above three operators or
the negative thereof.

3.11 Systems of Multiple Particles 

Suppose now we have a system of N quantum particles moving in $\mathbb{R}^n$. If the
particles are all of different types (e.g., one electron and one proton), then
the Hilbert space for this system is $L^2(\mathbb{R}^{nN})$. That is, the wave function
$\psi$ of the system is a function of variables $\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N$, with each $\mathbf{x}^j$
belonging to $\mathbb{R}^n$. If we normalize $\psi$ to be a unit vector in $L^2(\mathbb{R}^{nN})$, then
$|\psi(\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N)|^2$ is to be interpreted as the joint probability distribution
for the positions of the N particles.

We may introduce position operators $X^j_k$ (the kth component of the
position of the jth particle) and momentum operators $P^j_k$ in obvious
analogy to the definition for a single particle. The typical Hamiltonian operator
for such a system is then

\[ H\psi(\mathbf{x}^1, \ldots, \mathbf{x}^N) = -\sum_{j=1}^N \frac{\hbar^2}{2m_j}\Delta_j\psi(\mathbf{x}^1, \ldots, \mathbf{x}^N) + V(\mathbf{x}^1, \ldots, \mathbf{x}^N)\psi(\mathbf{x}^1, \ldots, \mathbf{x}^N), \]

where $m_j$ is the mass of the jth particle. Here $\Delta_j$ means the Laplacian
with respect to the variable $\mathbf{x}^j \in \mathbb{R}^n$, with the other variables fixed.

As we will see in Chap. 19, the Hilbert space for a composite system,
made up of various subsystems, is typically taken to be the (Hilbert) tensor
product of the individual Hilbert spaces. In the present context, we may
think of our system as being made up of N subsystems, each being one of the
individual particles. Fortunately, there is a natural isomorphism
(Proposition 19.12) between $L^2(\mathbb{R}^{nN})$ and the tensor product of N copies of
$L^2(\mathbb{R}^n)$, so that the approach we are taking here is consistent with the
general philosophy.

If the particles in question are identical (say, all electrons), then there
is an additional complication to the description of the Hilbert space for
the system. In standard quantum theory, we are supposed to believe that
“identical particles are indistinguishable.” What this means is that the wave
function should have the property that if we interchange, say, $\mathbf{x}^1$ with $\mathbf{x}^2$,
then the new wave function should represent the same physical state as
the original wave function. Recalling that two unit vectors in the quantum
Hilbert space represent the same physical state if and only if they differ by
a constant of absolute value 1, this means we should have

\[ \psi(\mathbf{x}^2, \mathbf{x}^1, \mathbf{x}^3, \ldots, \mathbf{x}^N) = u\,\psi(\mathbf{x}^1, \mathbf{x}^2, \mathbf{x}^3, \ldots, \mathbf{x}^N), \]

for some constant u with $|u| = 1$. Applying this rule twice gives that $\psi$ is
equal to $u^2\psi$, so evidently u must be either 1 or −1.

Particles in quantum mechanics are grouped into two types, according
to whether the constant u in the previous paragraph is 1 or −1. Particles
with u = 1 are called bosons and particles with u = −1 are called fermions.
Whether a particle is a boson or a fermion is determined by the spin of the
particle, a concept that we have not yet introduced. Nevertheless, we can
say that particles without spin are bosons. For a collection of N identical
spinless particles moving in $\mathbb{R}^3$, the proper Hilbert space is the symmetric
subspace of $L^2(\mathbb{R}^{3N})$, that is, the space of functions in $L^2(\mathbb{R}^{3N})$ that are
invariant under arbitrary permutations of the variables. We will have more
to say about spin and systems of identical particles in Chaps. 17 and 19.
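
As a small illustration of the exchange rule, the sketch below takes N = 2 particles on the line (a simplification of the $\mathbb{R}^3$ setting above) and builds symmetric and antisymmetric combinations of a product wave function. The one-particle functions inside product_state are arbitrary assumptions, and no normalization is attempted.

```python
import math


def exchange(f):
    """The wave function with the two particle positions interchanged."""
    return lambda x1, x2: f(x2, x1)


def symmetrize(f):
    """Symmetric combination: exchange multiplies it by u = 1."""
    return lambda x1, x2: f(x1, x2) + f(x2, x1)


def antisymmetrize(f):
    """Antisymmetric combination: exchange multiplies it by u = -1."""
    return lambda x1, x2: f(x1, x2) - f(x2, x1)


def product_state(x1, x2):
    """Unnormalized product of two different one-particle functions (assumed)."""
    return math.exp(-x1 ** 2) * x2 * math.exp(-x2 ** 2 / 2)


boson = symmetrize(product_state)
fermion = antisymmetrize(product_state)

print(boson(0.3, 1.1), exchange(boson)(0.3, 1.1))
print(fermion(0.3, 1.1), exchange(fermion)(0.3, 1.1))
```

The antisymmetric combination vanishes whenever the two positions coincide, which can be read off directly from its definition.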


3.12 Physics Notation 

In quantum mechanics, physicists almost invariably use the Dirac notation
(or bra-ket notation) introduced by Dirac in 1939 [5]. This notation
is made up of Notations 3.27-3.29 below. In this section, we explore the
Dirac notation along with a few other notational differences between the
mathematics and physics literature.

Before proceeding it is important to point out that when using Dirac 
notation, it is essential that the complex conjugate in the inner product 
should go on the first factor. 

Notation 3.27 A vector $\psi$ in $\mathbf{H}$ is referred to as a ket and is denoted
$|\psi\rangle$. A continuous linear functional on $\mathbf{H}$ is called a bra. For any $\phi \in \mathbf{H}$,
let $\langle\phi|$ denote the bra given by

\[ \langle\phi|(\psi) = \langle\phi, \psi\rangle. \]

That is to say, $\langle\phi|$ is the “inner product with $\phi$” functional. The bracket
(or bra-ket) of two vectors $\phi, \psi \in \mathbf{H}$ is the result of applying the bra $\langle\phi|$ to
the ket $|\psi\rangle$, namely the inner product of $\phi$ and $\psi$, denoted $\langle\phi|\psi\rangle$.

If A is an operator on $\mathbf{H}$ and $\phi$ is a vector in $\mathbf{H}$, then we can form
the linear functional $\langle\phi|A$, i.e., the linear map $\psi \mapsto \langle\phi, A\psi\rangle$. Physicists
generally write an expression of this form as

\[ \langle\phi|A|\psi\rangle. \]

This notation emphasizes that there are two different ways of thinking of
this quantity. We may think of $\langle\phi|A|\psi\rangle$ either as the linear functional
$\langle\phi|A$ applied to the vector $|\psi\rangle$, or as the linear functional $\langle\phi|$ applied to
the vector $A|\psi\rangle$.

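Notation 3.28 For any $\phi$ and $\psi$ in $\mathbf{H}$, the expression $|\phi\rangle\langle\psi|$ denotes the
linear operator on $\mathbf{H}$ given by

\[ (|\phi\rangle\langle\psi|)(|\chi\rangle) = |\phi\rangle\langle\psi|\chi\rangle = \langle\psi|\chi\rangle\,|\phi\rangle. \]

That is, in mathematics notation, $|\phi\rangle\langle\psi|$ is the operator sending $\chi$ to $\langle\psi, \chi\rangle\phi$.

The operator $|\phi\rangle\langle\psi|$ associates to each (ket) vector $|\chi\rangle$ a new vector in
the only way that makes notational sense: We interpret $|\phi\rangle\langle\psi|\chi\rangle$ as the
vector $|\phi\rangle$ multiplied by the scalar $\langle\psi|\chi\rangle$.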

Notation 3.29 Given a family of vectors in $\mathbf{H}$ labeled by, say, three indices
n, l, and m, rather than denoting these vectors as $|\psi_{n,l,m}\rangle$, a physicist will
denote them simply as $|n, l, m\rangle$.

This notation is not without its pitfalls. If we have two different sets
of vectors labeled by the same set of indices, a mathematician can simply
label them as $\phi_{n,l,m}$ and $\psi_{n,l,m}$, but the physicist has a problem.

As an example of the Dirac notation, suppose that an operator H has
an orthonormal basis of eigenvectors $\psi_n$. A physicist would express the
decomposition of a general vector in terms of this basis as

\[ I = \sum_n |n\rangle\langle n|, \tag{3.52} \]

where $\psi_n$ is represented simply as $|n\rangle$ and where $|n\rangle\langle n|$ is (given that $|n\rangle$ is
a unit vector) the orthogonal projection onto the one-dimensional subspace
spanned by the vector $|n\rangle$.
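
In finite dimensions these notations become concrete matrix operations. The sketch below works in $\mathbb{C}^2$, with vectors stored as Python lists; the particular vectors PHI, PSI, CHI and the orthonormal basis are arbitrary illustrative assumptions. It forms the matrix of $|\phi\rangle\langle\psi|$, checks that it sends $\chi$ to $\langle\psi, \chi\rangle\phi$, and then sums the projections $|n\rangle\langle n|$ as in (3.52).

```python
import math


def inner(u, v):
    """Inner product, conjugate-linear in the first factor as in the text."""
    return sum(x.conjugate() * y for x, y in zip(u, v))


def ketbra(phi, psi):
    """Matrix of |phi><psi|: the (i, j) entry is phi_i times conj(psi_j)."""
    return [[p * q.conjugate() for q in psi] for p in phi]


def matvec(a, v):
    """Apply the matrix a to the vector v."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]


# Arbitrary test vectors in C^2 (assumptions made for illustration).
PHI = [1 + 2j, 0.5j]
PSI = [2 - 1j, 3 + 0j]
CHI = [1j, 1 - 1j]

applied = matvec(ketbra(PHI, PSI), CHI)          # (|phi><psi|) |chi>
expected = [inner(PSI, CHI) * p for p in PHI]    # <psi|chi> |phi>

# An orthonormal basis |1>, |2> of C^2 and the sum of its projections.
S = 1 / math.sqrt(2)
BASIS = [[S + 0j, S + 0j], [S + 0j, -S + 0j]]
resolution = [[sum(ketbra(e, e)[i][j] for e in BASIS) for j in range(2)]
              for i in range(2)]

print(applied, expected, resolution)
```

Up to rounding, the last matrix printed should be the identity.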

Notation 3.30 In the physics literature, the complex conjugate of a
complex number z is denoted as $z^*$, rather than $\bar{z}$, as in the mathematics
literature. What a mathematician calls the adjoint of an operator and denotes
by $A^*$, a physicist calls the Hermitian conjugate of A and denotes by $A^\dagger$.
Physicists refer to self-adjoint operators as Hermitian.

We may express the concept of an adjoint (or Hermitian conjugate) of
an operator using Dirac notation, as follows. If A is a bounded operator on
$\mathbf{H}$, then $A^\dagger$ is the unique bounded operator such that

\[ \langle\phi|A = \langle A^\dagger\phi| \quad \text{for all } \phi \in \mathbf{H}. \]

One peculiarity of the physics literature on quantum mechanics is a
conspicuous failure of most articles to state what the Hilbert space is.
Rather than starting by defining the Hilbert space in which they are
working, physicists generally start by writing down the commutation relations
that hold among various operators on the space. Thus, for example, a
physicist might begin with position and momentum operators X and P,
satisfying $[X, P] = i\hbar I$, without ever specifying what space these operators are
operating on. The justification for this omission is, presumably, the
Stone-von Neumann theorem, which asserts that (provided the operators satisfy
the expected “exponentiated” relations) there is, up to unitary
equivalence, only one Hilbert space with operators satisfying these relations and
on which the operators act irreducibly. (See Chap. 14 for a precise
statement of the result.) It is, nevertheless, disconcerting for a mathematician to
encounter an entire paper full of computations involving certain operators,
without any specification of what space these operators are operating on,
let alone how the operators act on the space.

This practice among physicists represents something of a role reversal.
In the setting of linear algebra, for example, a mathematician might say,
“Let V be an n-dimensional vector space over $\mathbb{R}$.” If a physicist says, “Oh, so
it’s $\mathbb{R}^n$,” the mathematician will reply, “No, no, you don’t have to choose a
basis.” By contrast, in quantum mechanics, it is the physicist who does not
want to choose a particular realization of the space. A physicist will simply
write down the commutation relations between, say, X and P. If pressed,
the physicist might say that he is working in an irreducible representation
of those relations. If a mathematician then says, “Oh, so it’s $L^2(\mathbb{R})$,” the
physicist will reply, “No, no, there is no preferred realization.”

Notation 3.31 Given an irreducible representation of the canonical commutation
relations, and given a vector $\psi$ in the corresponding Hilbert space,
a physicist will speak of the position wave function $\psi(x)$, defined by

\[ \psi(x) = \langle x|\psi\rangle. \tag{3.53} \]

Here, $\langle x|$ is the bra associated with the ket $|x\rangle$, where $|x\rangle$ is supposed to be
an eigenvector for the position operator with eigenvalue x.

See, again, Chap. 14 for the precise notion of “irreducible representation
of the canonical commutation relations.” One may similarly define the
momentum wave function by taking the inner product of $\psi$ with the
eigenvectors of the momentum operator, which are also non-normalizable. See
Sect. 6.6 for details.

A mathematician might find Notation 3.31 objectionable on the grounds
that the operator X does not actually have any eigenvectors. After all,
it is harmless, in view of the Stone-von Neumann theorem, to work in
the “Schrödinger representation,” in which our Hilbert space is $L^2(\mathbb{R})$ and
the position operator X is just multiplication by x. Given a number $x_0$,
there is no nonzero element $\psi$ of $L^2(\mathbb{R})$ for which $X\psi = x_0\psi$. After all,
any $\psi$ satisfying this equation would have to be supported at the point
$x = x_0$, in which case $\psi$ would equal zero almost everywhere and would be
the zero element of $L^2(\mathbb{R})$. A physicist, on the other hand, would say that
the desired eigenfunction is $\psi(x) = \delta(x - x_0)$, where $\delta$ is the Dirac
delta-“function.” The fact that $\delta(x - x_0)$ is not actually in the Hilbert space
$L^2(\mathbb{R})$ does not concern the physicist; it is simply a “non-normalizable
state.” The mathematical theory of such non-normalizable states comes
under the heading “generalized eigenvectors.” See Sect. 6.6 for a discussion
of this issue in the case of the eigenvectors of the momentum operator.

A more subtle issue regarding the “position eigenvectors” is that each
eigenvector is unique only up to multiplication by a constant. If one wants
the momentum operator to act on the position wave function, as defined by
(3.53), in the usual way, one must make a consistent choice of normalization
of the eigenvectors of the position operator. Specifically, one should choose
the constants in such a way that the exponentiated momentum operator
$\exp(-iaP/\hbar)$ maps $|x\rangle$ to $|x + a\rangle$.

3.13 Exercises 

1. Suppose that $\phi(t)$ and $\psi(t)$ are differentiable functions with values in
a Hilbert space H, meaning that the limit

\[ \frac{d\phi}{dt} = \lim_{h \to 0} \frac{\phi(t + h) - \phi(t)}{h} \]

exists in the norm topology of H for each t, and similarly for $\psi(t)$.
Show that 



2. Suppose A and B are operators on a finite-dimensional Hilbert space
and suppose that $AB - BA = cI$ for some constant c. Show that $c = 0$.

Note: This shows that the commutation relations in (3.8) are a purely
infinite-dimensional phenomenon.

3. If A is a bounded operator on a Hilbert space H, then there exists a
unique bounded operator $A^*$ on H satisfying $\langle\phi, A\psi\rangle = \langle A^*\phi, \psi\rangle$ for
all $\phi$ and $\psi$ in H. (Appendix A.4.3.) The operator $A^*$ is called the
adjoint of A, and A is called self-adjoint if $A^* = A$.

(a) Show that for any bounded operator A and constant $c \in \mathbb{C}$, we
have $(cA)^* = \bar{c}A^*$, where $\bar{c}$ is the complex conjugate of c.

(b) Show that if A and B are self-adjoint, then the operator

\[ \frac{1}{i\hbar}[A, B] \]

is also self-adjoint.

4. Verify Proposition 3.19 using Proposition 3.14. Note that the operator
$V'(X)$ means simply the operator of multiplication by the function $V'(x)$.

5. Suppose that $\psi$ is a unit vector in $L^2(\mathbb{R})$ such that the functions
$x\psi(x)$ and $x^2\psi(x)$ also belong to $L^2(\mathbb{R})$. Show that


Hint: Consider the integral

\[ \int_{-\infty}^{\infty} (x - a)^2 |\psi(x)|^2 \, dx, \]

where $a = \langle X\rangle_\psi$.


6. Consider the Hamiltonian H for a quantum harmonic oscillator, given by

\[ H = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + \frac{k}{2}x^2, \]

where k is the spring constant of the oscillator. Show that the function

\[ \psi_0(x) = \exp\left\{ -\frac{\sqrt{km}}{2\hbar}x^2 \right\} \]

is an eigenvector for H with eigenvalue $\hbar\omega/2$, where $\omega := \sqrt{k/m}$ is
the classical frequency of the oscillator.

Note: We will explore the eigenvectors and eigenvalues of H in detail
in Chap. 11.

7. Prove Proposition 3.23. 

Hint: Show that [P(t),H] = ([P, H])(t) and [X(t),H] = ([X, H])(t). 

8. (a) Find the general solution to (3.43), where E is a negative real
number. Show that the only such solution that satisfies the
boundary conditions (3.44) is identically zero.

(b) Establish the same result as in Part (a) for E = 0.

9. (a) Suppose $\phi$ and $\psi$ are smooth functions on [0, L] satisfying the
boundary conditions (3.44). Using integration by parts, show
that

\[ \langle\phi, H\psi\rangle = \langle H\phi, \psi\rangle, \]

where $H = -(\hbar^2/2m)\, d^2/dx^2$ and where

\[ \langle\phi, \psi\rangle = \int_0^L \overline{\phi(x)}\,\psi(x)\, dx. \]

(b) Show that the result of Part (a) fails if $\phi$ and $\psi$ are arbitrary
smooth functions (not satisfying the boundary conditions).

10. Let $J_1$, $J_2$, and $J_3$ be the angular momentum operators for a particle
moving in $\mathbb{R}^3$. Using the canonical commutation relations
(Proposition 3.25), show that these operators satisfy the commutation
relations

This is the quantum mechanical counterpart to Exercise 19 in the
previous chapter.



4 

The Free Schrödinger Equation


In this chapter, we consider various methods of solving the free Schrödinger
equation in one space dimension. Here “free” means that there is no force
acting on the particle, so that we may take the potential V to be identically
zero. Thus, the free Schrödinger equation is

\[ \frac{\partial\psi}{\partial t} = \frac{i\hbar}{2m}\frac{\partial^2\psi}{\partial x^2}, \tag{4.1} \]

subject to an initial condition of the form

\[ \psi(x, 0) = \psi_0(x). \]

We will identify some key features of solutions to this equation, such as the 
“spread of the wave packet” and the distinction between “phase velocity” 
and “group velocity.” In particular, the notion of group velocity will confirm 
our expectation that a particle of momentum p should travel with velocity 
v = p/m. 

Before attempting to solve the free Schrödinger equation, let us make a
simple observation about the time evolution of the expected values of the
position and momentum. If we apply Proposition 3.19 in the case that V
is identically equal to zero, we have

\[ \begin{aligned}
\frac{d}{dt}\langle X\rangle_{\psi(t)} &= \frac{1}{m}\langle P\rangle_{\psi(t)}, \\
\frac{d}{dt}\langle P\rangle_{\psi(t)} &= 0.
\end{aligned} \]

Thus, the expectation value of P is independent of time, which then means
that the expectation value of X is linear in time:

\[ \begin{aligned}
\langle X\rangle_{\psi(t)} &= \langle X\rangle_{\psi_0} + \frac{t}{m}\langle P\rangle_{\psi_0}, \\
\langle P\rangle_{\psi(t)} &= \langle P\rangle_{\psi_0}.
\end{aligned} \]

Thus, the free Schrödinger equation is one of the special cases in which
the expected values of the position and momentum exactly follow classical
trajectories (and those classical trajectories are very simple in the case
V = 0).


4.1 Solution by Means of the Fourier Transform 


We look for solutions of the free Schrödinger equation on $\mathbb{R}^1$ of the form

\[ \psi(x, t) = \exp[i(kx - \omega(k)t)], \tag{4.2} \]

where k is the frequency in space and $\omega(k)$ is the frequency in time, which
is an as-yet-undetermined function of k. (Of course, such a solution is not
square-integrable in x for a fixed t, but we will find our way back to
square-integrable solutions eventually.) Plugging this into (4.1) easily gives the
formula for $\omega$ as a function of k:

\[ \omega(k) = \frac{\hbar k^2}{2m}. \tag{4.3} \]

A formula of this sort, expressing the temporal frequency $\omega$ as a function of
the spatial frequency k in a solution of some partial differential equation,
is called a dispersion relation.
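
For readers who want the substitution spelled out, here is the computation behind the dispersion relation, written as a short worked derivation. It uses only the free Schrödinger equation (4.1) and the trial solution (4.2).

```latex
\begin{align*}
\psi(x,t) &= e^{i(kx - \omega(k)t)}, \\
\frac{\partial\psi}{\partial t} &= -i\,\omega(k)\,\psi, \\
\frac{i\hbar}{2m}\frac{\partial^2\psi}{\partial x^2}
  &= \frac{i\hbar}{2m}(ik)^2\,\psi = -\frac{i\hbar k^2}{2m}\,\psi, \\
-i\,\omega(k)\,\psi = -\frac{i\hbar k^2}{2m}\,\psi
  \quad &\Longrightarrow \quad \omega(k) = \frac{\hbar k^2}{2m}.
\end{align*}
```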

Observe that (4.2) can be written as

\[ \psi(x, t) = \exp\left[ ik\left( x - \frac{\omega(k)}{k}t \right) \right]. \tag{4.4} \]

Now, replacing a function $f(x)$ by $f(x - a)$ has the effect of shifting f to
the right by a. Thus, the time-evolution has the effect of shifting the initial
function to the right by an amount equal to $(\omega(k)/k)t$. This means that
the function is moving to the right with speed $\omega(k)/k$. This speed,
for reasons that will be clearer in Sect. 4.3, is called the phase velocity.

The phase velocity, then, is the speed at which a pure exponential solution
of our equation (the free Schrödinger equation) propagates. We compute
the phase velocity as $\omega(k)/k = \hbar k/(2m)$. Now, we have said that a wave
function of the form $e^{ikx}$ represents a particle with momentum $p = \hbar k$.
We thus arrive at the following curious conclusion.

Proposition 4.1 The phase velocity of a particle with momentum $p = \hbar k$ is

\[ \text{phase velocity} = \frac{\omega(k)}{k} = \frac{\hbar k}{2m} = \frac{p}{2m}. \]

This velocity is half the velocity of a classical particle of momentum p.

Proposition 4.1 might make us think that our basic relation $p = \hbar k$ is
off by a factor of 2. We will see, however, that the phase velocity, that is,
the velocity of a pure exponential solution, is not the “real” velocity of a
particle with momentum p. The real velocity is the “group velocity,” which
will turn out to be, as expected, p/m.
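
The factor of 2 can be seen numerically. The sketch below is only an illustration: it works in hypothetical units with $\hbar = m = 1$, and it anticipates the later identification of the group velocity with the derivative $d\omega/dk$, which it approximates by a central difference. The function names are assumptions made for this example.

```python
HBAR = 1.0   # illustrative units (assumption): hbar = 1
MASS = 1.0   # illustrative units (assumption): m = 1


def omega(k):
    """Dispersion relation for the free particle: omega(k) = hbar k^2 / (2m)."""
    return HBAR * k**2 / (2 * MASS)


def phase_velocity(k):
    """Speed of a pure exponential solution: omega(k) / k."""
    return omega(k) / k


def group_velocity(k, rel_step=1e-6):
    """Central-difference approximation to d(omega)/dk at k."""
    step = rel_step * abs(k)
    return (omega(k + step) - omega(k - step)) / (2 * step)


k = 3.0                    # hypothetical wave number
p = HBAR * k               # corresponding momentum
v_phase = phase_velocity(k)
v_group = group_velocity(k)
```

With these definitions, `v_phase` should match $p/(2m)$ and `v_group` should match $p/m$ up to rounding error.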

Leaving aside for now the question of the velocity, let us build up a
general solution to (4.1) from solutions of the form (4.2). We make use of
the Fourier transform, discussed in Appendix A.3. We can then express the
solution to the free Schrödinger equation, for “nice” initial conditions, as a
“superposition” of these pure exponential solutions.


Proposition 4.2 Suppose that $\psi_0$ is a “nice” function, for example, a
Schwartz function (Definition A.15). Let $\hat{\psi}_0$ denote the Fourier transform
of $\psi_0$ and define $\psi(x, t)$ by

\[ \psi(x, t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \hat{\psi}_0(k)\, e^{i(kx - \omega(k)t)}\, dk, \tag{4.5} \]

where $\omega(k)$ is defined by (4.3). Then $\psi(x, t)$ solves the free Schrödinger
equation with initial condition $\psi_0$.


The assumption that $\psi_0$ be a Schwartz function is stronger than necessary.
The reader is invited to trace through the argument and find suitable
weaker conditions.

Proof. Since the Fourier transform of a Schwartz function is a Schwartz
function, $\hat{\psi}_0(k)$ will decay faster than $1/k^4$ as k tends to $\pm\infty$. Meanwhile,
by integrating the derivative of the function $e^{ikx}$, we obtain the estimate

\[ \left| \frac{e^{ik(x+h)} - e^{ikx}}{h} \right| \le |k|. \]

We can then apply dominated convergence, using $|k|\,|\hat{\psi}_0(k)|$ as our dominating
function, to move a derivative with respect to x under the integral
sign in the formula for $\psi(x, t)$. This derivative pulls down a factor of ik
inside the integral. The decay of $\hat{\psi}_0$ allows us to repeat this argument to
move a second derivative with respect to x inside the integral. We can also move
a derivative with respect to t inside the integral, by a similar argument.

Since $\exp\{i(kx - \omega(k)t)\}$ satisfies the Schrödinger equation for each
fixed k, differentiation under the integral shows that $\psi(x, t)$ satisfies the
Schrödinger equation as well. The Fourier inversion formula shows that
$\psi(x, 0) = \psi_0(x)$. ■

Proposition 4.3 If $\psi(x, t)$ is as in Proposition 4.2, then the Fourier
transform of $\psi(x, t)$, with respect to x with t fixed, is given by

\[ \hat{\psi}(k, t) = \hat{\psi}_0(k) \exp\left[ -i\frac{\hbar k^2 t}{2m} \right]. \tag{4.6} \]

Proof. We can write (4.5) as

\[ \psi(x, t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \left[ \hat{\psi}_0(k)\, e^{-i\omega(k)t} \right] e^{ikx}\, dk. \]

By the uniqueness of the Fourier decomposition (i.e., the injectivity of the
inverse Fourier transform, which follows from the Plancherel formula), the
Fourier transform of $\psi(x, t)$ (with respect to x) must be the function in
square brackets. Putting in the expression (4.3) for $\omega(k)$ establishes the
desired result. ■

Now, the Fourier transform is a unitary map from $L^2(\mathbb{R})$ onto $L^2(\mathbb{R})$.
Thus, for any $\psi_0$ in $L^2(\mathbb{R})$, $\hat{\psi}_0$ also belongs to $L^2(\mathbb{R})$. Since the quantity
multiplying $\hat{\psi}_0(k)$ in (4.6) has absolute value 1, the right-hand side of (4.6)
is a well-defined square-integrable function of k, for any $\psi_0$ in $L^2(\mathbb{R})$, which
has a well-defined inverse Fourier transform in $L^2(\mathbb{R})$.

Definition 4.4 For any $\psi_0 \in L^2(\mathbb{R})$, define, for each $t \in \mathbb{R}$, $\psi(x, t)$ to be
the unique element of $L^2(\mathbb{R})$ that has a Fourier transform (with respect to
x) given by (4.6).

Definition 4.4 defines a time-evolution for arbitrary initial conditions
in $L^2(\mathbb{R})$. For general $\psi_0 \in L^2(\mathbb{R})$, however, $\psi(x, t)$ may not satisfy the
Schrödinger equation in the classical, pointwise sense, simply because $\psi(x, t)$
may fail to be differentiable, either in x or in t. Nevertheless, $\psi(x, t)$, as
defined by Definition 4.4, always satisfies the Schrödinger equation in the
weak (distributional) sense. See Exercise 1.
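
The recipe "transform, multiply by a phase, transform back" translates directly into a small numerical experiment. The following is a minimal sketch under stated assumptions, not a production solver: it replaces the Fourier transform on $L^2(\mathbb{R})$ by a discrete Fourier transform on a periodic grid, uses hypothetical units with $\hbar = m = 1$, and takes a hypothetical Gaussian wave packet as initial condition. A direct $O(N^2)$ transform keeps the example free of external libraries.

```python
import cmath
import math


def dft(values, sign):
    """Direct O(N^2) discrete Fourier transform (unnormalized).

    sign = -1 gives the forward transform, sign = +1 the inverse up to 1/N.
    """
    n = len(values)
    return [
        sum(v * cmath.exp(sign * 2j * math.pi * j * q / n) for j, v in enumerate(values))
        for q in range(n)
    ]


def evolve_free(psi0, dx, t, hbar=1.0, m=1.0):
    """Multiply each Fourier mode by exp(-i hbar k^2 t / (2m)), then transform back."""
    n = len(psi0)
    psi_hat = dft(psi0, -1)
    evolved_hat = []
    for q, coeff in enumerate(psi_hat):
        q_signed = q if q < n // 2 else q - n        # map DFT index to signed frequency
        k = 2 * math.pi * q_signed / (n * dx)
        evolved_hat.append(coeff * cmath.exp(-1j * hbar * k**2 * t / (2 * m)))
    return [v / n for v in dft(evolved_hat, +1)]


# Hypothetical grid and initial packet: exp(i k0 x) times a Gaussian envelope.
n, length = 128, 40.0
dx = length / n
xs = [-length / 2 + j * dx for j in range(n)]
k0 = 2.0
psi0 = [cmath.exp(1j * k0 * x) * math.exp(-x**2 / 2) for x in xs]


def norm_sq(psi):
    """Riemann-sum approximation to the squared L^2 norm."""
    return sum(abs(v)**2 for v in psi) * dx


def mean_x(psi):
    """Riemann-sum approximation to the expected position."""
    return sum(x * abs(v)**2 for x, v in zip(xs, psi)) * dx / norm_sq(psi)


t = 2.0
psi_t = evolve_free(psi0, dx, t)
```

Because each Fourier mode is multiplied by a number of absolute value 1, the discrete norm is preserved, and for this packet the expected position should advance by approximately $(\hbar k_0/m)t$, in line with the classical trajectory of the expectation values.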

4.2 Solution as a Convolution 

According to Proposition 4.3, we see that the Fourier transform of the
time-t wave function is the product of the Fourier transform of $\psi_0$ and
the function $\exp[-it\hbar k^2/(2m)]$. According to Proposition A.21, the inverse
Fourier transform of a product of two sufficiently nice functions is $1/\sqrt{2\pi}$
times the convolution of the two separate inverse Fourier transforms. Here
the convolution $\phi * \psi$ of two functions $\phi$ and $\psi$ is defined to be

\[ (\phi * \psi)(x) = \int_{-\infty}^{\infty} \phi(x - y)\,\psi(y)\, dy, \]

whenever the integral is convergent for all x.

Formally, then, we ought to have

\[ \psi(x, t) = \psi_0 * K_t, \tag{4.7} \]

where

\[ K_t := \frac{1}{\sqrt{2\pi}}\,\mathcal{F}^{-1}\left( \exp\left[ -i\frac{\hbar k^2 t}{2m} \right] \right). \]

The problem with this idea is that the function $\exp[-it\hbar k^2/(2m)]$ is not
a “nice” function in the usual sense. Certainly, this function is not the
Fourier transform of some function in $L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, because if it were,
then the function would have to tend to zero at infinity (Proposition A.14).
Therefore, we cannot directly apply Proposition A.21, even if $\psi_0$ is in
$L^1(\mathbb{R}) \cap L^2(\mathbb{R})$.

Fortunately, the desired inverse Fourier transform can be computed as a
convergent improper integral (Exercise 2), with the following result:

\[ K_t(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \exp\left[ ikx - i\frac{\hbar k^2 t}{2m} \right] dk
   = \sqrt{\frac{m}{2\pi i\hbar t}}\, \exp\left\{ i\frac{m}{2t\hbar}x^2 \right\}. \tag{4.8} \]

Here, the square root is the one with positive real part. The function $K_t$
is called the fundamental solution of the free Schrödinger equation. (See
Fig. 4.1.) This function does indeed satisfy the free Schrödinger equation,
as we can easily verify by direct differentiation.
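
The direct differentiation can be laid out as follows. This is a sketch of the verification for $t \neq 0$, assuming the closed form $K_t(x) = \sqrt{m/(2\pi i\hbar t)}\,\exp\{imx^2/(2\hbar t)\}$; the prefactor is proportional to $t^{-1/2}$, which accounts for the term $-1/(2t)$.

```latex
\begin{align*}
\frac{\partial K_t}{\partial t}
  &= \left( -\frac{1}{2t} - \frac{i m x^2}{2\hbar t^2} \right) K_t, \\
\frac{\partial K_t}{\partial x}
  &= \frac{i m x}{\hbar t}\, K_t, \qquad
\frac{\partial^2 K_t}{\partial x^2}
   = \left( \frac{i m}{\hbar t} - \frac{m^2 x^2}{\hbar^2 t^2} \right) K_t, \\
\frac{i\hbar}{2m}\frac{\partial^2 K_t}{\partial x^2}
  &= \left( -\frac{1}{2t} - \frac{i m x^2}{2\hbar t^2} \right) K_t
   = \frac{\partial K_t}{\partial t}.
\end{align*}
```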

The preceding discussion should make the following result plausible. 

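Theorem 4.5 Suppose $\psi_0 \in L^2(\mathbb{R}) \cap L^1(\mathbb{R})$. Then $\psi(x, t)$, as defined by
(4.5), may be computed for all $t \neq 0$ as

\[ \psi(x, t) = \sqrt{\frac{m}{2\pi i\hbar t}} \int_{-\infty}^{\infty} \exp\left\{ i\frac{m}{2t\hbar}(x - y)^2 \right\} \psi_0(y)\, dy. \]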


The expression for $\psi(x, t)$ is $K_t * \psi_0$, where $K_t$ is as in (4.8).

Proof. For any set $E \subset \mathbb{R}$, let $1_E$ denote the indicator function of E, that
is, the function that is 1 on E and 0 elsewhere. Then $K_t 1_{[-n,n]}$ belongs to
$L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ for any positive integer n. By Proposition A.21, then, we
have

\[ \mathcal{F}\left( (K_t 1_{[-n,n]}) * \psi_0 \right) = \sqrt{2\pi}\, \mathcal{F}(K_t 1_{[-n,n]})\, \mathcal{F}(\psi_0). \tag{4.9} \]

Because $\psi_0$ is in $L^1(\mathbb{R})$, it is easy to see that $K_t 1_{[-n,n]} * \psi_0$ converges
pointwise to $K_t * \psi_0$. On the other hand, using the argument in Exercise 2,
we can see that $\mathcal{F}(K_t 1_{[-n,n]})$ is bounded by a constant independent of n
and converges pointwise to the function

\[ \frac{1}{\sqrt{2\pi}} \exp\left[ -i\frac{\hbar k^2 t}{2m} \right]. \tag{4.10} \]

FIGURE 4.1. The real part of $K_t(x)$, for t = 1 (top) and t = 0.2 (bottom).

Equation (4.10) is enough to show that the right-hand side of (4.9)
converges in $L^2(\mathbb{R})$ to the function

\[ \exp\left[ -i\frac{\hbar k^2 t}{2m} \right] \hat{\psi}_0(k). \]

By the Plancherel theorem, $K_t 1_{[-n,n]} * \psi_0$ must also be converging in $L^2(\mathbb{R})$,
and the $L^2$ limit must coincide with the pointwise limit, which is $K_t * \psi_0$.
Thus, taking limits on both sides of (4.9) shows that the Fourier transform
of $K_t * \psi_0$ is what we want it to be. ■

In general, to be considered the fundamental solution of a certain equation,
a function should converge to a Dirac $\delta$-function (Example A.26), in
the distribution sense, as t tends to zero. Since $|K_t(x)|$ is independent of
x for each t, it might seem doubtful that $K_t$ has this property. On the
other hand, we can see $K_t(x)$ oscillates very rapidly except near x = 0.
(See Fig. 4.1.) This oscillation causes the integral of $K_t(x)$ against some
nice function $\psi(x)$ to be small, except for the part of the integral near
x = 0. Indeed, because the Fourier transform of $K_t$ converges to the constant
function $1/\sqrt{2\pi}$ (which is what we get by formally taking the Fourier
transform of the $\delta$-function) as t tends to zero, it is not hard to show that
$K_t$ does, in fact, converge to a $\delta$-function. The details of this verification
are left to the reader.


4.3 Propagation of the Wave Packet: First 
Approach 

Let us consider the Schrödinger equation in $\mathbb{R}^1$ with an initial condition
$\psi_0$ that is a “wave packet,” meaning a complex exponential multiplied by
some function that localizes $\psi_0$ in space. Specifically, we take

\[ \psi_0(x) = e^{ip_0 x/\hbar} A_0(x), \tag{4.11} \]

where $A_0$ is some real, positive function and $p_0$ is a nonzero real number.
(The case $p_0 = 0$ should be treated separately.) We also assume that $A_0$ is
“slowly varying” compared to $e^{ip_0 x/\hbar}$, meaning that $A_0$ is approximately
constant over many periods of the function $e^{ip_0 x/\hbar}$. (We will give a more
precise meaning to the “slowly varying” condition shortly.) Thus, if we look
at $\psi_0(x)$ on a distance scale of a small number of periods of the function
$e^{ip_0 x/\hbar}$, then $\psi_0$ will look like a constant times $e^{ip_0 x/\hbar}$, which, as we have
seen, represents a particle with momentum $p_0$. We expect, then, that the
wave function $\psi_0$ represents a particle with momentum approximately equal
to $p_0$.

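Let us now try to solve the free Schrödinger equation in terms of the
amplitude and phase of the wave function. We write

\[ \psi(x, t) = A(x, t)\, e^{i\theta(x, t)}, \]

where A and $\theta$ are real-valued functions. If we plug this expression for $\psi$
into the free Schrödinger equation and then cancel a factor of $e^{i\theta}$ from
every term, we obtain the equation

\[ \frac{\partial A}{\partial t} + i\frac{\partial\theta}{\partial t}A
   = \frac{i\hbar}{2m}\frac{\partial^2 A}{\partial x^2}
   - \frac{\hbar}{m}\frac{\partial A}{\partial x}\frac{\partial\theta}{\partial x}
   - \frac{i\hbar}{2m}A\left( \frac{\partial\theta}{\partial x} \right)^2
   - \frac{\hbar}{2m}A\frac{\partial^2\theta}{\partial x^2}. \tag{4.12} \]

Since A and $\theta$ are real-valued, we may separately equate the real and
imaginary parts of (4.12), giving

\[ \frac{\partial A}{\partial t} = -\frac{\hbar}{m}\frac{\partial A}{\partial x}\frac{\partial\theta}{\partial x}
   - \frac{\hbar}{2m}A\frac{\partial^2\theta}{\partial x^2} \tag{4.13} \]

and (after dividing the imaginary part of (4.12) by A)

\[ \frac{\partial\theta}{\partial t} = \frac{\hbar}{2m}\frac{1}{A}\frac{\partial^2 A}{\partial x^2}
   - \frac{\hbar}{2m}\left( \frac{\partial\theta}{\partial x} \right)^2. \tag{4.14} \]

Any solution to this system of partial differential equations will yield a
solution $\psi(x, t) = A(x, t)\, e^{i\theta(x, t)}$ to the free Schrödinger equation.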

Since we are assuming A is “slowly varying” compared to $\theta$, it is reasonable
to think that the first term on the right-hand side of (4.14) will be
small compared to the second term. That is to say, we interpret the slowly
varying condition to mean

\[ \left| \frac{1}{A}\frac{\partial^2 A}{\partial x^2} \right| \ll \left( \frac{\partial\theta}{\partial x} \right)^2, \tag{4.15} \]

where the symbol $\ll$ means “much smaller than.” We will take initial conditions
such that (4.15) holds at t = 0, and then we will assume that (4.15)
continues to hold at least for small positive times. We may then (to first
approximation) drop the first term on the right-hand side of (4.14), giving
the following simplified version of (4.14):

\[ \frac{\partial\theta}{\partial t} = -\frac{\hbar}{2m}\left( \frac{\partial\theta}{\partial x} \right)^2. \tag{4.16} \]

We now look for a solution to the pair of equations (4.13) and (4.16)
with initial conditions corresponding to (4.11).

Proposition 4.6 A solution to the approximate equations (4.13) and
(4.16) with initial condition $\theta(x, 0) = p_0 x/\hbar$ is given by

\[ \theta(x, t) = \frac{p_0}{\hbar}\left( x - \frac{p_0}{2m}t \right) \tag{4.17} \]

and

\[ A(x, t) = A_0\left( x - \frac{p_0}{m}t \right). \tag{4.18} \]

This yields an approximate solution to the free Schrödinger equation
given by

\[ \psi(x, t) = A_0\left( x - \frac{p_0}{m}t \right) \exp\left[ \frac{ip_0}{\hbar}\left( x - \frac{p_0}{2m}t \right) \right]. \tag{4.19} \]

Note from (4.17) and (4.18) that if the “slowly varying” condition (4.15)
holds at time 0, it will continue to hold for all positive times in our
approximate solution.

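Proof. Although (4.16) is a nonlinear equation, we can find a solution to
it with the simple initial conditions $\theta(x, 0) = p_0 x/\hbar$, namely,

\[ \theta(x, t) = \frac{p_0 x}{\hbar} - \frac{p_0^2 t}{2m\hbar}
   = \frac{p_0}{\hbar}\left( x - \frac{p_0}{2m}t \right). \tag{4.20} \]

Since $\partial\theta/\partial x = p_0/\hbar$ and $\partial^2\theta/\partial x^2 = 0$, if we plug (4.20) back into (4.13)
we obtain

\[ \frac{\partial A}{\partial t} = -\frac{p_0}{m}\frac{\partial A}{\partial x}. \]

The (presumably unique) solution to this linear equation with initial
condition $A(x, 0) = A_0(x)$ is

\[ A(x, t) = A_0\left( x - \frac{p_0}{m}t \right), \tag{4.21} \]

as claimed. ■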
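
Both claims in the proof can be confirmed by substitution. The short check below assumes only that $A_0$ is differentiable, with $A_0'$ denoting its derivative; it verifies that the phase $\theta(x,t) = (p_0/\hbar)(x - p_0 t/(2m))$ satisfies the simplified phase equation and that the shifted amplitude satisfies the resulting transport equation.

```latex
\begin{align*}
\frac{\partial\theta}{\partial t} &= -\frac{p_0^2}{2m\hbar},
&
-\frac{\hbar}{2m}\left( \frac{\partial\theta}{\partial x} \right)^2
  &= -\frac{\hbar}{2m}\cdot\frac{p_0^2}{\hbar^2} = -\frac{p_0^2}{2m\hbar}, \\
\frac{\partial}{\partial t} A_0\!\left( x - \frac{p_0}{m}t \right)
  &= -\frac{p_0}{m}\, A_0'\!\left( x - \frac{p_0}{m}t \right),
&
-\frac{p_0}{m}\frac{\partial}{\partial x} A_0\!\left( x - \frac{p_0}{m}t \right)
  &= -\frac{p_0}{m}\, A_0'\!\left( x - \frac{p_0}{m}t \right).
\end{align*}
```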

We hope that the solution (4.19) to the system of equations (4.13)
and (4.16) is a close approximation to the solution to the original pair of
equations (4.13) and (4.14), assuming, of course, that $A_0$ is slowly varying
compared to $\theta_0(x) = p_0 x/\hbar$. It is not especially easy to estimate directly
how rapidly solutions to (4.13) and (4.16) diverge from solutions to (4.13)
and (4.14). We will therefore leave an estimate of the error in our
approximation until the next section, where we will obtain the same approximate
solution by a different method.

Note that a function of the form $f(x, t) = \phi(x - vt)$ is moving to the right
with constant velocity v. (If v is negative, then, of course, this means the
function is moving to the left.) Observe that both the amplitude $A(x, t)$ and
the phase $\exp\{i\theta(x, t)\}$ are of this form, but with two different velocities.


Conclusion 4.7 In the approximate solution (4.19) to the free Schrödinger
equation, the amplitude $A(x, t)$ is moving with velocity $p_0/m$, whereas the
phase $\theta(x, t)$ is moving with velocity $p_0/(2m)$. These two velocities are called
the group velocity and the phase velocity, respectively:

\[ \text{phase velocity} = \frac{p_0}{2m}, \qquad \text{group velocity} = \frac{p_0}{m}. \]

Note that the formula for the phase velocity agrees with the one given
previously in Sect. 4.1, the velocity of propagation of a pure exponential
solution to the free Schrödinger equation. Indeed, nothing prevents us from
taking $A_0 \equiv 1$, in which case the left-hand side of (4.15) is actually
identically zero, so that a solution to (4.13) and (4.16) is actually a solution to
(4.13) and (4.14).

Which of the velocities is the “real” velocity of the particle? The answer
is: the group velocity. After all, the probability distribution for the
particle’s position is determined by the amplitude of the wave function and is
unaffected by the phase. It is the amplitude that determines (as much as it
can be determined) where the particle is. Thus, the true velocity of the
particle should be the velocity at which the amplitude propagates. Figure 4.2
shows the propagation of the real part of a wave packet, with the motion
of a single peak indicated by the shaded region. The phase velocity
determines the speed at which the individual peaks in the real part of $\psi$ move,
whereas the group velocity determines the speed of the packet as a whole.
Since the peak we are tracking lags well behind the motion of the whole
packet, we see that the phase velocity is smaller than the group velocity.

We should expect that solutions to our approximate equations (4.13)
and (4.16) will diverge slowly over time from solutions to the free
Schrödinger equation (4.13) and (4.14). For sufficiently long times, there
may be a significant difference between approximate and true solutions.
This expectation is confirmed in Sect. 4.5, where we investigate the spread
of the wave packet, a phenomenon that is not seen in our approximation.


4.4 Propagation of the Wave Packet: Second 
Approach 

We have seen that the general solution of the free Schrödinger equation can
be obtained by means of the Fourier transform as

\[ \psi(x, t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \hat{\psi}_0(k) \exp[i(kx - \omega(k)t)]\, dk, \tag{4.22} \]

where

\[ \omega(k) = \frac{\hbar k^2}{2m}. \tag{4.23} \]

Let us assume that $\psi_0$ has approximate momentum equal to $p_0$. Thus, we
expect that $\hat{\psi}_0(k)$ will be concentrated near $k_0 := p_0/\hbar$. If that is the case,
then only the values of k close to $k_0$ are important. For k close to $k_0$, we
use the first-order Taylor expansion

\[ \omega(k) \approx \omega(k_0) + \omega'(k_0)(k - k_0), \tag{4.24} \]

where for now we do not put in the explicit formula for $\omega'(k_0)$.


FIGURE 4.2. Propagation of Re[$\psi$], with motion of a single peak shaded

Inserting (4.24) into (4.22), we get two factors that are independent of k
and come outside the integral, leaving us with

\[ \begin{aligned}
\psi(x, t) &\approx \frac{1}{\sqrt{2\pi}}\, e^{i\omega'(k_0)k_0 t}\, e^{-i\omega(k_0)t}
   \int_{-\infty}^{\infty} \hat{\psi}_0(k) \exp[ik(x - \omega'(k_0)t)]\, dk \\
 &= e^{i\omega'(k_0)k_0 t}\, e^{-i\omega(k_0)t}\, \psi_0(x - \omega'(k_0)t).
\end{aligned} \tag{4.25} \]

Note that the factors in front of $\psi_0(x - \omega'(k_0)t)$ are simply constants,
that is, independent of x. These constants do not affect the “state” of the
system, in that we have said that two vectors in the quantum Hilbert space
that differ by a constant represent the same physical state. Ignoring these
constants, we are left with the factor of $\psi_0(x - \omega'(k_0)t)$, which is simply
shifting to the right at speed $\omega'(k_0)$. Thus, the (approximate) velocity at
which our wave packet is moving is

\[ \text{velocity} \approx \omega'(k_0) = \frac{\hbar k_0}{m} = \frac{p_0}{m}. \]

Let us consider the special case in which $\psi_0$ is of the form

\[ \psi_0(x) = e^{ik_0 x} A_0(x), \]

where $A_0$ is real and positive. Then (4.25) becomes

\[ e^{i\omega'(k_0)k_0 t}\, e^{-i\omega(k_0)t}\, e^{ik_0(x - \omega'(k_0)t)}\, A_0(x - \omega'(k_0)t). \]

After canceling the terms involving $\omega'(k_0)k_0 t$ in the exponent, we obtain

\[ \psi(x, t) \approx e^{i(k_0 x - \omega(k_0)t)}\, A_0(x - \omega'(k_0)t). \]

Recalling that $p_0 = \hbar k_0$ and putting in the formula for $\omega$, we see that this
approximation to $\psi(x, t)$ is precisely the same as the one we obtained, by
a different method, in Proposition 4.6.

As in Sect. 4.3, we see that the velocity at which a pure exponential
solution of the free Schrödinger equation propagates [namely,
$\omega(k_0)/k_0 = \hbar k_0/(2m)$] is not the same as the velocity at which the overall wave packet
propagates. Rather, as seen in (4.25), the wave packet propagates at a
velocity given by $\omega'(k_0) = \hbar k_0/m$. We may summarize this conclusion in
the following proposition.


Proposition 4.8 The speed at which a pure exponential solution of the
free Schrödinger equation propagates is

\[ \text{phase velocity} = \frac{\omega(k_0)}{k_0} = \frac{\hbar k_0}{2m} = \frac{p_0}{2m}. \]

By contrast, the (approximate) speed at which the wave packet propagates is

\[ \text{group velocity} = \left. \frac{d\omega}{dk} \right|_{k = k_0} = \frac{\hbar k_0}{m} = \frac{p_0}{m}. \]

The disadvantage of the method we used in Sect. 4.3 is that it does not
easily yield estimates on how big an error there is in our approximation.
In the current section, however, we can estimate the error by comparing
the Fourier transforms of the exact solution and the approximate solution.
Our error estimate will involve a quantity $\kappa$ defined as follows:

\[ \kappa = \left( \int_{-\infty}^{\infty} (k - k_0)^4\, |\hat{\psi}_0(k)|^2\, dk \right)^{1/4}. \tag{4.26} \]

The quantity $\kappa$ is, roughly, half the width of the interval around $k_0$ on
which most of $\hat{\psi}_0(k)$ is concentrated. If, for example, $\hat{\psi}_0$ is supported in the
interval $[k_0 - \varepsilon, k_0 + \varepsilon]$, then $\kappa \le \varepsilon$, assuming that $\psi_0$ (and therefore $\hat{\psi}_0$) is
a unit vector. (A more common measure of concentration would replace
$(k - k_0)^4$ by $(k - k_0)^2$ and the fourth root of the integral by the square
root. But the “quartic” measure of concentration in (4.26) is the one that
arises in estimating the error of our approximations in this section.)

Proposition 4.9 Let $\psi(x, t)$ be the exact solution to the free Schrödinger
equation with initial condition $\psi_0$, and let $\phi(x, t)$ be the approximate
solution given by the right-hand side of (4.25). Then the following $L^2$ estimate
holds:

\[ \|\psi(x, t) - \phi(x, t)\|_{L^2(\mathbb{R})} \le \frac{|t|\,\hbar\kappa^2}{2m} = |t|\,\omega(\kappa), \tag{4.27} \]

where the $L^2$ norm is with respect to x with t fixed and where $\omega(\cdot)$ is defined
by (4.23).

Equation (4.27) means that the $L^2$ norm of the error will be small, provided
that

\[ |t| \ll \frac{1}{\omega(\kappa)}. \]

If $\kappa$ is much smaller than $k_0$, then $1/\omega(\kappa)$ will be much larger than $1/\omega(k_0)$.
That means that the timescale on which the true and approximate solutions
diverge will be long compared to the timescale on which our approximate
solution is oscillating.

Proof. Let $\hat{\psi}(k, t)$ and $\hat{\phi}(k, t)$ denote the Fourier transforms of $\psi$ and $\phi$
with respect to x, with t fixed. From (4.22) we can read off that

\[ \hat{\psi}(k, t) = e^{-i\omega(k)t}\, \hat{\psi}_0(k). \]

Meanwhile, $\hat{\phi}(k, t)$ is obtained from $\hat{\psi}(k, t)$ by replacing $\omega(k)$ by the
right-hand side of (4.24). Now, direct calculation shows that

\[ \omega(k) - \left( \omega(k_0) + \omega'(k_0)(k - k_0) \right) = \frac{\hbar}{2m}(k - k_0)^2. \]

From this expression and the elementary estimate $|e^{i\theta} - e^{i\phi}| \le |\theta - \phi|$
we obtain

\[ \left| \hat{\psi}(k, t) - \hat{\phi}(k, t) \right| \le \frac{|t|\,\hbar}{2m}(k - k_0)^2\, |\hat{\psi}_0(k)|. \tag{4.28} \]

The estimate (4.27) then follows by the Plancherel theorem and the
definition of $\kappa$. ■

For a more detailed version of the approach used in this section, see
Sect. 5.6 of [30].
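
The error bound in Proposition 4.9 can be tested numerically in Fourier space, where, by the Plancherel theorem, the $L^2$ norm of the error equals the $L^2$ norm of the difference of the two Fourier transforms. The sketch below is an illustration under stated assumptions: units with $\hbar = m = 1$, a hypothetical Gaussian $\hat{\psi}_0$ centered at $k_0$ whose squared modulus has standard deviation `S`, and a simple midpoint rule for the integrals. None of the names come from the text.

```python
import cmath
import math

HBAR, MASS = 1.0, 1.0   # illustrative units (assumption)
K0, S = 5.0, 0.2        # hypothetical packet centre and spread in k


def omega(k):
    """Exact dispersion relation omega(k) = hbar k^2 / (2m)."""
    return HBAR * k**2 / (2 * MASS)


def omega_linear(k):
    """First-order Taylor expansion of omega about K0."""
    return omega(K0) + (HBAR * K0 / MASS) * (k - K0)


def psi0_hat(k):
    """Unit-norm Gaussian: |psi0_hat|^2 is a normal density with std S."""
    return (2 * math.pi * S**2) ** -0.25 * math.exp(-(k - K0)**2 / (4 * S**2))


def integrate(f, a, b, n=4000):
    """Midpoint rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return sum(f(a + (j + 0.5) * h) for j in range(n)) * h


A, B = K0 - 10 * S, K0 + 10 * S   # the Gaussian is negligible outside this window

# Quartic measure of concentration: fourth root of the fourth central moment.
kappa = integrate(lambda k: (k - K0)**4 * psi0_hat(k)**2, A, B) ** 0.25


def l2_error(t):
    """L^2 norm (in k) of the difference between exact and approximate transforms."""
    def integrand(k):
        diff = cmath.exp(-1j * omega(k) * t) - cmath.exp(-1j * omega_linear(k) * t)
        return abs(diff * psi0_hat(k))**2
    return math.sqrt(integrate(integrand, A, B))


def bound(t):
    """Right-hand side of the estimate: |t| hbar kappa^2 / (2m)."""
    return abs(t) * HBAR * kappa**2 / (2 * MASS)
```

For a Gaussian of this form the fourth moment is $3S^4$, so `kappa` should be close to $3^{1/4}S$; comparing `l2_error(t)` with `bound(t)` for a few values of t gives a feel for how conservative the estimate is at short and long times.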


4.5 Spread of the Wave Packet 

We use the uncertainty (Definition 3.13) $\Delta_\psi X$ in the position of the particle
as a measure of the “width” of $\psi(x)$ as a function of x. At the level of
approximation considered in the previous two sections, the uncertainty in
the position of a free particle is independent of time. After all, in the
approximate solution (4.19), the amplitude of the wave function simply
shifts to the right at a speed equal to the group velocity, without changing
shape. A more precise calculation, however, shows that after sufficiently
long times, the wave packet spreads out in space. (Exercise 7 gives an idea
of the time scale on which this spread takes place.)

We can compute the time-evolution of the uncertainty in the particle’s
position without having to solve the full Schrödinger equation, by using
Proposition 3.14 from Chap. 3. We start by observing that for a free particle,
our Hamiltonian is simply $P^2/(2m)$, which commutes with P. It follows
that the expected value and uncertainty for the particle’s momentum
(and, indeed, the entire probability distribution of the momentum) are
independent of time. Meanwhile, to compute the time-dependence of $\langle X\rangle$
and $\langle X^2\rangle$, we use Proposition 3.14 along with the commutation relation
$[X, P] = i\hbar I$ (Proposition 3.8).

Proposition 4.10 For a wave function $\psi(x, t)$ evolving according to the
free Schrödinger equation on $\mathbb{R}^1$, the expectation values for X and $X^2$ evolve
as follows:

\[ \langle X\rangle_{\psi(t)} = \langle X\rangle_{\psi_0} + \frac{t}{m}\langle P\rangle_{\psi_0} \]

and

\[ \langle X^2\rangle_{\psi(t)} = \langle X^2\rangle_{\psi_0} + \frac{t}{m}\langle XP + PX\rangle_{\psi_0} + \frac{t^2}{m^2}\langle P^2\rangle_{\psi_0}. \]

These relations imply the following result:

\[ (\Delta_{\psi(t)} X)^2 = \frac{t^2}{m^2}(\Delta_{\psi_0} P)^2
   + \frac{t}{m}\left( \langle XP + PX\rangle_{\psi_0} - 2\langle X\rangle_{\psi_0}\langle P\rangle_{\psi_0} \right)
   + (\Delta_{\psi_0} X)^2. \]

For a unit vector $\psi_0$ in $L^2(\mathbb{R})$, the uncertainty $\Delta_{\psi_0} P$ in the momentum
cannot be zero, because the uncertainty would be zero only if $\psi_0$ is an
eigenvector for the momentum operator. But the eigenvectors for P are
the functions of the form $e^{ikx}$, which are not in $L^2(\mathbb{R})$. Thus, the leading
coefficient in the expression for $(\Delta_{\psi(t)} X)^2$ is never zero, and thus $\Delta_{\psi(t)} X$
tends to infinity as t tends to infinity.

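Proof. We compute that

\[ [P^2, X] = P^2 X - PXP + PXP - XP^2 = P[P,X] + [P,X]P = -2i\hbar P. \]

Thus (as we have already noted in Sect. 3.7.5),

\[ \frac{d}{dt} \langle X \rangle_{\psi(t)} = \frac{1}{m} \langle P \rangle_{\psi(t)} = \frac{1}{m} \langle P \rangle_{\psi_0}, \tag{4.29} \]

where we have used in the last equality that the expected momentum is
independent of time. Since the derivative of $\langle X \rangle_{\psi(t)}$ is constant,
$\langle X \rangle_{\psi(t)}$ itself is a linear function of $t$, which gives the first result
in the proposition. Meanwhile, a little algebra shows that

\[ [P^2, X^2] = P[P,X]X + [P,X]PX + XP[P,X] + X[P,X]P = -2i\hbar (PX + XP), \]

and

\[ [P^2, PX + XP] = P[P^2, X] + [P^2, X]P = -4i\hbar P^2. \]

Thus

\[ \frac{d}{dt} \langle X^2 \rangle_{\psi(t)} = \frac{i}{2m\hbar} \langle [P^2, X^2] \rangle_{\psi(t)} = \frac{1}{m} \langle XP + PX \rangle_{\psi(t)} \]

and

\[ \frac{d^2}{dt^2} \langle X^2 \rangle_{\psi(t)} = \frac{i}{2m^2\hbar} \langle [P^2, XP + PX] \rangle_{\psi(t)} = \frac{2}{m^2} \langle P^2 \rangle_{\psi_0}. \]

Since the second derivative of $\langle X^2 \rangle_{\psi(t)}$ is independent of $t$,
$\langle X^2 \rangle_{\psi(t)}$ itself is a quadratic polynomial in $t$, the coefficients of
which are determined by the value of $\langle X^2 \rangle_{\psi(t)}$ and its first two
time-derivatives at $t = 0$. This leads to the second result in the
proposition. The last result follows by direct calculation. ■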
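
The spreading formula in Proposition 4.10 can be checked numerically. The following is a minimal sketch, not part of the text's argument, under several assumptions: units with $\hbar = m = 1$, a Gaussian initial state centered at $x = 0$ with position variance $\sigma^2$ and mean wave number $k_0$ (so that $(\Delta_{\psi_0}P)^2 = \hbar^2/(4\sigma^2)$ and the cross term vanishes), and the fact that $X$ acts as $i\,d/dk$ on the Fourier transform. The function names are hypothetical.

```python
import cmath
import math

HBAR = 1.0  # assumed units: hbar = m = 1
MASS = 1.0


def psi_hat(k, t, sigma, k0):
    """Momentum-space wave function of a free Gaussian packet at time t."""
    norm = (2.0 * sigma**2 / math.pi) ** 0.25
    envelope = norm * math.exp(-(sigma**2) * (k - k0) ** 2)
    # Free evolution multiplies each mode by exp(-i * omega(k) * t).
    phase = cmath.exp(-1j * HBAR * k**2 * t / (2.0 * MASS))
    return envelope * phase


def variance_numeric(t, sigma=1.0, k0=2.0, half_width=6.0, n=8001):
    """(Delta X)^2 at time t, computed in momentum space with X = i d/dk."""
    h = 2.0 * half_width / (n - 1)
    mean_x = 0.0
    mean_x2 = 0.0
    for j in range(n):
        k = k0 - half_width + j * h
        f = psi_hat(k, t, sigma, k0)
        # Central difference approximation to d(psi_hat)/dk.
        df = (psi_hat(k + h, t, sigma, k0) - psi_hat(k - h, t, sigma, k0)) / (2.0 * h)
        mean_x += (f.conjugate() * 1j * df).real * h  # <X>
        mean_x2 += abs(df) ** 2 * h                   # <X^2>
    return mean_x2 - mean_x**2


def variance_formula(t, var_x0, var_p0, cross0):
    """Last display of Proposition 4.10.

    cross0 stands for <XP + PX> - 2 <X><P> in the initial state.
    """
    return (t / MASS) ** 2 * var_p0 + (t / MASS) * cross0 + var_x0
```

For the Gaussian above one would compare `variance_numeric(t)` with `variance_formula(t, sigma**2, HBAR**2 / (4 * sigma**2), 0.0)`; the two should agree up to discretization error, and both grow quadratically in $t$.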
4.6 Exercises


1. A locally integrable function $\psi(x,t)$ satisfies the free Schrödinger
equation in the weak (or distributional) sense if for each smooth
compactly supported function $\chi(x,t)$, we have

\[ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \psi(x,t) \left[ \frac{\partial \chi}{\partial t} + \frac{i\hbar}{2m} \frac{\partial^2 \chi}{\partial x^2} \right] dx\, dt = 0. \tag{4.30} \]

[One obtains (4.30) by assuming $\partial\psi/\partial t - (i\hbar/2m)\,\partial^2\psi/\partial x^2$ is zero,
integrating against $\chi(x,t)$, and then formally integrating by parts.]

(a) Show that if $\psi(x,t)$ is smooth as a function of $x$ and $t$, then $\psi$
satisfies the free Schrödinger equation in the pointwise sense if
and only if $\psi$ satisfies the free Schrödinger equation in the weak
sense.

Hint: Proposition A.23 may be useful.

(b) For any $\psi_0 \in L^2(\mathbb{R})$, define $\psi(x,t)$ by Definition 4.4. Show that
$\psi$ satisfies the free Schrödinger equation in the weak sense.

First show that the function $\psi_A$ given by

\[ \psi_A(x,t) = \frac{1}{\sqrt{2\pi}} \int_{-A}^{A} \hat{\psi}_0(k)\, e^{i(kx - \omega(k)t)}\, dk \]

satisfies the free Schrödinger equation in the weak sense, for each $A$.


2. (a) Show that for any $a \in \mathbb{C}$ with $\operatorname{Re}(a) > 0$,

\[ \left( \int_{-\infty}^{\infty} e^{-x^2/(2a)}\, dx \right)^2 = \int_{\mathbb{R}^2} e^{-(x^2+y^2)/(2a)}\, dx\, dy = 2\pi a, \]

where the integral over $\mathbb{R}^2$ can be evaluated using polar
coordinates. Conclude that

\[ \int_{-\infty}^{\infty} e^{-x^2/(2a)}\, dx = \sqrt{2\pi a}, \tag{4.31} \]

where the square root is the one with positive real part.

(b) Show that for all $A, B > 0$ we have

\[ \int_A^B e^{-x^2/(2a)}\, dx = -a \left. \frac{e^{-x^2/(2a)}}{x} \right|_A^B - a \int_A^B \frac{e^{-x^2/(2a)}}{x^2}\, dx \]

for any nonzero complex number $a$. Using this, show that the
integral in (4.31) is convergent for all nonzero $a$ with $\operatorname{Re} a \ge 0$,
provided the integral is interpreted as an improper integral (i.e.,
the limit as $A$ tends to infinity of an integral from $-A$ to $A$).

(c) Now show that the result of Part (a) is valid also for nonzero
values of $a$ with $\operatorname{Re} a = 0$.

Hint: Given $\beta \ne 0$, show that the (improper) integral from $A$
to $\infty$ of $\exp[-x^2/(2(\alpha + i\beta))]$ is small for large $A$, uniformly in
$\alpha \in [0, 1]$.

(d) Show that

\[ \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{ikx} e^{-it\hbar k^2/(2m)}\, dk = \sqrt{\frac{m}{2\pi i t \hbar}}\; e^{imx^2/(2t\hbar)}, \]

where the integral is interpreted as an improper integral and the
square root is the one with positive real part.

3. Suppose $\phi$ is a Schwartz function (Definition A.15) and $\psi$ belongs to
$L^2(\mathbb{R})$. Show that the convolution $\phi * \psi$ is smooth (infinitely
differentiable).

4. Consider the heat equation for a function $\psi(x,t)$, given by

\[ \frac{\partial \psi}{\partial t} = \alpha \frac{\partial^2 \psi}{\partial x^2}, \]

where $\alpha$ is a constant, subject to the initial condition $\psi(x, 0) = \psi_0(x)$.

(a) Derive a differential equation for $\hat{\psi}(k,t)$, the Fourier transform
of a solution of the heat equation with respect to $x$, with $t$ fixed,
assuming that $\psi(x,t)$ is a “nice” function of $x$ for each $t$. Solve
this equation subject to the initial condition $\hat{\psi}(k, 0) = \hat{\psi}_0(k)$.

(b) Obtain an expression for the solution to the heat equation as
a convolution of $\psi_0$ with a “fundamental solution” to the heat
equation.

Note: As we will discuss in Chap. 20, the heat equation can be thought
of as a sort of “imaginary time” version of the free Schrödinger
equation.


5. Suppose we take an initial condition in the free Schrödinger equation
with initial phase given by $\theta_0(x) = p_0 x/\hbar$ and initial amplitude given
by $A_0(x)$, as in (4.11). Suppose also that the initial amplitude is of
the form


Aq{x) = exp 




Note that $A_0$ is centered around the point $x_0$ and that the parameter
$L$ is a measure of the “width” in space of our initial wave packet.
A function of the form $\psi_0(x) = e^{ip_0 x/\hbar} A_0(x)$, with $A_0$ as above, is
called a Gaussian wave packet.

Compute the quantity

\[ \frac{1}{\left( \dfrac{d\theta_0}{dx} \right)^2} \left( \frac{1}{A_0} \frac{d^2 A_0}{dx^2} \right). \tag{4.32} \]

Assuming that $\hbar$ is small compared to $L p_0$, show that (4.32) is small,
except at points where our initial wave packet is very small.

Note: This shows that our “slowly varying” assumption (4.15) is
reasonable for the case of Gaussian wave packets.


6. The Klein-Gordon equation, a proposed relativistic alternative to the
Schrödinger equation, is the equation

\[ \frac{1}{c^2} \frac{\partial^2 \psi}{\partial t^2} - \frac{\partial^2 \psi}{\partial x^2} = -\frac{m^2 c^2}{\hbar^2}\, \psi, \]

where $m > 0$ is the mass of the particle and $c$ is the speed of light.

(a) Obtain the dispersion relation for the Klein-Gordon equation,
that is, the expression for $\omega(k)$ that makes the function
$\exp[i(kx - \omega(k)t)]$ a solution to the Klein-Gordon equation.

(b) Show that the phase velocity $\omega(k)/k$ satisfies $|\omega(k)/k| > c$, that
the group velocity $d\omega(k)/dk$ satisfies $|d\omega/dk| < c$, and that

(phase velocity)(group velocity) $= c^2$.

Note: Since the Klein-Gordon equation is second order in time, there
will be two possible values for $\omega(k)$ for each $k$, one positive and one
negative. The results of Part (b) hold for both of the two “branches”
of $\omega(k)$.


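7. Consider the uncertainty $\Delta_{\psi(t)} X$ of a wave function evolving
according to the free Schrödinger equation. Show that

\[ \frac{d}{dt} \left( \Delta_{\psi(t)} X \right) \le \frac{\Delta_{\psi_0} P}{m} \tag{4.33} \]

for all $t$ and that

\[ \lim_{t \to +\infty} \frac{d}{dt} \left( \Delta_{\psi(t)} X \right) = \frac{\Delta_{\psi_0} P}{m}. \]

Note: By comparison,

\[ \frac{d}{dt} \langle X \rangle_{\psi(t)} = \frac{\langle P \rangle_{\psi_0}}{m}. \tag{4.34} \]

If $\hat{\psi}_0(k)$ is concentrated in a sufficiently small region around a nonzero
number $k_0 = p_0/\hbar$, then $\Delta_{\psi_0} P$ will be small compared to $\langle P \rangle_{\psi_0}$. In
that case, by comparing (4.33) to (4.34), we see that the rate at which
the wave packet spreads out is small compared to the rate at which
the wave packet moves.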
A Particle in a Square Well


5.1 The Time-Independent Schrodinger Equation 

It is difficult to solve the time-dependent Schrödinger equation explicitly,
even in relatively simple cases. (Even for the free Schrödinger equation,
we made do in Chap. 4 with solutions that are either approximate or that
involve an integral that is not explicitly evaluated.) Usually, then, one
analyzes the time-independent Schrödinger equation (the eigenvector equation
for $H$) and then attempts to infer something about the time-dependent
problem from the results. There are a number of problems, including the
harmonic oscillator and the hydrogen atom, in which the time-independent
Schrödinger equation can be solved explicitly.

In this section, we will consider a simple but instructive example, which
can be solved by elementary methods. We consider the time-independent
Schrödinger equation in $\mathbb{R}^1$, with a potential of the form

\[ V(x) = \begin{cases} -C, & -A \le x \le A \\ 0, & |x| > A \end{cases} \tag{5.1} \]

where $A$ and $C$ are positive constants. The region $-A \le x \le A$ is the
“square well” for the potential (Fig. 5.1).

Let us think first for a moment about the behavior of a classical particle
in a square well. If we think of $V$ as the limit of a sequence of potentials
that change linearly from $-C$ to 0 in a small interval around $\pm A$, we may
expect the following behavior for a particle in a square well. If the energy
of the particle is negative, then the particle must be in the well. In that
case, it will move with constant speed until it hits the edge of the well,
at which point it will reflect instantaneously off the wall and move with
the same speed in the opposite direction. If the energy of the particle is
positive, it will move always in the same direction, with speed equal to one
constant when it is not in the well and speed equal to a different constant
when it is in the well.

FIGURE 5.1. A square well potential.

In the quantum case, we will be interested mainly in eigenvectors for the
Schrödinger operator with negative eigenvalues ($E < 0$). Of course, on the
quantum side of things, energy eigenvectors do not change in time, except 
for an overall phase factor. Nevertheless, since the classical particle with 
E < 0 spends the same amount of time in each part of the well, we may 
expect that the quantum particle will have approximately equal probability 
of being found in each part of the well. This expectation will be fulfilled 
for “highly excited states,” such as the one in Fig. 5.7. For the quantum 
particle, however, there is a small but nonzero probability of finding the 
particle outside the well, which is impossible classically. 

Our goal is to study the time-independent Schrödinger equation, that is,
the eigenvalue equation

\[ -\frac{\hbar^2}{2m} \frac{d^2\psi}{dx^2} + V(x)\psi(x) = E\psi(x), \tag{5.2} \]

where both the eigenvalues $E$ and the associated eigenvectors $\psi$ (or
“eigenfunctions,” in physics terminology) are as yet unknown. As a second-order
linear ordinary differential equation, this equation always has (for any value
of $E$) a two-dimensional solution space. We are, however, looking for
solutions that lie in the quantum Hilbert space $L^2(\mathbb{R})$. We will see there are
actually only finitely many $E$’s, all of them with $E < 0$, for which (5.2)
has a nonzero solution in $L^2(\mathbb{R})$. In this case, then, the Schrödinger
operator $H$ has a discrete spectrum below zero and a continuous spectrum
above zero.
5.2 Domain Questions and the Matching
Conditions 


Before starting to solve (5.2), we must give some heed to the unbounded
nature of the Hamiltonian operator. The Schrödinger operator

\[ H = -\frac{\hbar^2}{2m} \frac{d^2}{dx^2} + V(X) \]

on the left-hand side of (5.2) is an unbounded operator, meaning that there
is no constant $C$ such that $\|H\psi\| \le C\|\psi\|$ for all $\psi$, where $\|\cdot\|$ is
the $L^2$ norm. On the other hand, we want to define $H$ in such a way that it
is self-adjoint. But according to Corollary 9.9, a self-adjoint operator that
is defined on the whole Hilbert space must be bounded.

We conclude, then, that $H$ is not going to be defined on the entire Hilbert
space $L^2(\mathbb{R})$, but only on a dense subspace thereof. In practical terms,
saying that $H$ is not defined on the whole Hilbert space means simply that
for many functions $\psi$ in $L^2(\mathbb{R})$, the second derivative $d^2\psi/dx^2$ does not
exist, or exists but fails to be in $L^2$. (In our example, the potential $V$ is
bounded, and so $V\psi$ will always be in $L^2$ provided that $\psi$ is in $L^2$.)

Since the potential $V$ for a square well is bounded, the domain of the
Hamiltonian $H = P^2/(2m) + V(X)$ is the same as the domain of the
kinetic energy operator $P^2/(2m) = -(\hbar^2/2m)\,d^2/dx^2$. As we will see in
Sect. 9.7, the domain of the kinetic energy operator may be described as
the space of $L^2$ functions $\psi$ for which $d^2\psi/dx^2$, computed in the weak
or distributional sense (Appendix A.3.3), again belongs to $L^2(\mathbb{R})$. This
condition is equivalent to the statement that there exists some $L^2$ function
$\phi$ such that $\psi$ is the second integral of $\phi$ (for some choice of the
constants of integration).

Meanwhile, since our potential is piecewise constant, any solution $\psi$
to (5.2) will be smooth except possibly at the transition points $x = \pm A$,
and both $\psi$ and $\psi'$ will have left and right limits at $A$ and $-A$. Indeed, on
each of the intervals $(-\infty, -A)$, $(-A, A)$, and $(A, \infty)$, any solution to (5.2)
will be simply a linear combination of (real or complex) exponentials. For
functions of this sort, it is not hard to see when we are in the domain of $H$.


Proposition 5.1 Suppose $\psi$ is smooth on each of the intervals $(-\infty, -A)$,
$(-A, A)$, and $(A, \infty)$. Then $\psi$ belongs to the domain of $H$ [with potential
function given by (5.1)] if and only if (1) $\psi$ and $d\psi/dx$ are continuous
at $x = \pm A$, and (2) $d^2\psi/dx^2$ belongs to $L^2(\mathbb{R})$.

Proof. Suppose first that $\psi$ satisfies the conditions (1) and (2). Then it is
not hard to see (Exercise 1) that the second derivative of $\psi$ in the
distribution sense is simply the function $d^2\psi/dx^2$, computed in the ordinary
pointwise sense for $x \ne \pm A$. (The second derivative may not exist at $x = \pm A$,
but we simply leave $d^2\psi/dx^2$ undefined at these two points, which form a
set of measure zero.) Thus, $d^2\psi/dx^2$, computed in the distribution sense,
is an element of $L^2(\mathbb{R})$.

On the other hand, if either $\psi$ or $\psi'$ has a discontinuity at $x = A$ or at
$x = -A$, then (Exercise 1 again) the distributional derivative will contain
either a multiple of a $\delta$-function or a multiple of the derivative of a
$\delta$-function at one of these points. But neither a $\delta$-function nor the
derivative of a $\delta$-function is a square-integrable function. ■

Let us think about what the continuity condition on $\psi$ and $d\psi/dx$ means
in practical terms. Since $V$ is constant on $(-\infty, -A)$, we can easily solve
(5.2) on that interval, obtaining a two-dimensional solution space. Once we
choose a solution from this solution space, then the values of $\psi$ and $d\psi/dx$
as $x$ approaches $-A$ from the left will serve as the initial conditions for
solving (5.2) on $(-A, A)$. Thus, the requirement of continuity for $\psi$ and
$d\psi/dx$ serves as a “matching condition” between the solution on $(-\infty, -A)$
and the solution on $(-A, A)$. We cannot just separately pick any solution to
(5.2) on $(-\infty, -A)$ and any solution on $(-A, A)$; at the boundary, the values of
$\psi$ and $d\psi/dx$ must match. (This same matching condition appears in
elementary treatments of ordinary differential equations with discontinuous
coefficients.)

Once we pick a solution on $(-\infty, -A)$ we get a unique solution on
$(-A, A)$, and then the values of $\psi$ and $d\psi/dx$ as we approach $A$ from
the left will serve as the initial conditions for solving (5.2) on $(A, \infty)$. The
conclusion is that once we pick a solution to (5.2) on $(-\infty, -A)$ (from the
two-dimensional solution space), we have no additional choices to make;
the differential equation along with the matching conditions give a unique
way to extend the solution from $(-\infty, -A)$ to the whole real line.


5.3 Finding Square-integrable Solutions 

If $E > 0$, then any solution to (5.2) will be a combination of two complex
exponentials in the range $x < -A$; such a function cannot be
square-integrable unless it is identically zero. If, however, we take $\psi$ to be
identically zero in the region $x < -A$, then our continuity condition requires
that $\psi$ and $d\psi/dx$ approach 0 as $x$ approaches $-A$ from the right. Thus,
the matching conditions at $-A$ force the solution to be identically zero in
$[-A, A]$ as well. Finally, by matching across $x = A$, we get an identically
zero solution on $[A, \infty)$. Thus, for $E > 0$, any solution to (5.2)
satisfying the continuity conditions in Proposition 5.1 must be identically zero.
A similar analysis applies when $E = 0$, where the solutions to (5.2) on
$(-\infty, -A]$ would be of the form $c_1 + c_2 x$, which is square-integrable only if
$c_1 = c_2 = 0$.
The conclusion, then, is that to have a chance to get a solution to (5.2)
that is square-integrable and in the domain of $H$, we must take $E < 0$. For
$E < 0$, the solution to (5.2) on $(-\infty, -A)$ will be a linear combination of
the two exponentials $\exp(\alpha x)$ and $\exp(-\alpha x)$, where

\[ \alpha = \frac{\sqrt{2m|E|}}{\hbar}. \tag{5.3} \]

For $\psi$ to be square-integrable over $(-\infty, -A)$, the coefficient of $\exp(-\alpha x)$
must be zero, since this term grows exponentially as $x$ tends to $-\infty$. Thus,
the value of $\psi$ on $(-\infty, -A)$ must be $c\exp(\alpha x)$. Once we choose a value
for $c$, we get a unique solution on $(-A, A)$ by matching $\psi$ and $\psi'$ across
$x = -A$. We then get a unique solution on $(A, \infty)$ by matching across
$x = A$. The solution on $(A, \infty)$ will again be a linear combination
of $\exp(\alpha x)$ and $\exp(-\alpha x)$. For $\psi$ to be in $L^2(\mathbb{R})$, we need the coefficient of
$\exp(\alpha x)$ on $(A, \infty)$ to be zero. We have no choice, however, about what $\psi$
is on $(A, \infty)$; the coefficient of $\exp(\alpha x)$ either comes out to be zero or it
does not.

The conclusion, then, is that for any $E < 0$, there is a unique (up to a
constant) solution to (5.2) that is square-integrable on the interval $(-\infty, -A)$.
This solution then gives rise to a unique solution on $(-A, A)$ and then to a
unique solution on $(A, \infty)$, up to a constant. Unless we are lucky, the
solution on $(A, \infty)$ will grow exponentially and thus fail to be in $L^2$. Therefore,
in most cases there will be no nonzero solution to (5.2) that satisfies the
continuity condition and is square-integrable over the whole real line. The
hope is that for certain special values of $E$, we will be able to find a
solution that decays exponentially both on $(-\infty, -A)$ and on $(A, \infty)$, in which
case the solution will belong to $L^2(\mathbb{R})$.

It can be shown (Exercise 6) that there are no nonzero square-integrable
solutions with $E \le -C$. Therefore, any square-integrable solutions to (5.2)
that may exist must come from the range $-C < E < 0$. To analyze this
range, let us rewrite the time-independent Schrödinger equation by dividing
through by $-\hbar^2/(2m)$, yielding the equation

\[ \frac{d^2\psi}{dx^2} = \begin{cases} \varepsilon\psi, & |x| > A \\ -(c - \varepsilon)\psi, & |x| \le A \end{cases} \tag{5.4} \]

where

\[ \varepsilon = -\frac{2mE}{\hbar^2}, \qquad c = \frac{2mC}{\hbar^2}. \tag{5.5} \]

Note that although $E$ is assumed to be negative, we have normalized $\varepsilon$ to
be positive; the condition $-C < E < 0$ corresponds to $0 < \varepsilon < c$.
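
The “shooting” procedure just described can be sketched numerically in the normalized variables of (5.4) and (5.5). The following is a minimal illustration under stated assumptions, not a definitive implementation: it starts from $\psi(x) = \exp(\sqrt{\varepsilon}\,x)$ for $x < -A$, matches $\psi$ and $\psi'$ across $x = \pm A$ in closed form, and returns the coefficient of the growing exponential on $(A, \infty)$. A bound state corresponds to a value of $\varepsilon$ in $(0, c)$ where this coefficient vanishes. The function names and the bisection bracket are hypothetical choices.

```python
import math


def growing_coefficient(eps, c, A):
    """Coefficient of exp(+sqrt(eps) * (x - A)) on (A, oo) after matching."""
    alpha = math.sqrt(eps)      # decay rate outside the well
    kappa = math.sqrt(c - eps)  # wave number inside the well
    # Values of psi and psi' at x = -A, coming from psi = exp(alpha * x).
    psi_left = math.exp(-alpha * A)
    dpsi_left = alpha * psi_left
    # Inside the well psi'' = -kappa^2 psi; propagate across a width of 2A.
    s = 2.0 * kappa * A
    psi_right = psi_left * math.cos(s) + (dpsi_left / kappa) * math.sin(s)
    dpsi_right = -psi_left * kappa * math.sin(s) + dpsi_left * math.cos(s)
    # Outside: psi = g * exp(alpha (x - A)) + d * exp(-alpha (x - A)).
    return 0.5 * (psi_right + dpsi_right / alpha)


def bisect(f, lo, hi, iterations=200):
    """Plain bisection; requires a sign change of f on [lo, hi]."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("no sign change on the bracket")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def shoot_for_eigenvalue(c, A, lo=1e-9, hi=None):
    """Find one eps in (lo, hi) at which the growing coefficient vanishes."""
    hi = c - 1e-9 if hi is None else hi
    return bisect(lambda eps: growing_coefficient(eps, c, A), lo, hi)
```

For a shallow well such as `shoot_for_eigenvalue(1.0, 1.0)` the default bracket contains a single sign change. For deeper wells the coefficient changes sign several times, so one would have to supply narrower brackets to isolate each eigenvalue.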
Because our potential function $V$ is even, it is easy to see that for any
solution $\psi$ to (5.4), the even and odd parts of $\psi$ are also solutions. We can,
therefore, analyze even solutions and odd solutions separately. We begin
with the even case. For $x < -A$, every solution to (5.4) that is
square-integrable over $(-\infty, -A)$ is of the form

\[ \psi(x) = a e^{\sqrt{\varepsilon}\, x}, \quad x < -A. \tag{5.6} \]

Since we assume that $\psi$ is even, we then have

\[ \psi(x) = a e^{-\sqrt{\varepsilon}\, x}, \quad x > A. \tag{5.7} \]

Meanwhile, for $-A < x < A$, every even solution is of the form

\[ \psi(x) = b \cos\left( \sqrt{c - \varepsilon}\, x \right). \tag{5.8} \]

Proposition 5.2 Let $\psi$ be the function defined in (5.6)-(5.8). Then there
exist nonzero constants $a$ and $b$ so that $\psi$ belongs to the domain of $H$ if
and only if the following matching condition holds:

\[ \sqrt{\varepsilon} = \sqrt{c - \varepsilon}\, \tan\left( \sqrt{c - \varepsilon}\, A \right). \tag{5.9} \]

Proof. Clearly both $\psi$ and $d^2\psi/dx^2$ belong to $L^2(\mathbb{R})$. Thus, in light of
Proposition 5.1, we need only ensure that $\psi(x)$ and $\psi'(x)$ are continuous
at $x = \pm A$. Since the exponential functions are never zero, we may always
ensure that $\psi$ itself is continuous by taking any value we like for $b$ and then
choosing $a$ appropriately. Once $\psi$ has been made to be continuous, $\psi'$ will
be continuous provided that $\psi'(x)/\psi(x)$ has the same value as we approach
$\pm A$ from inside the well or from the outside. To obtain the condition (5.9),
we compute $\psi'/\psi$ from (5.6) and then from (5.8), evaluate both quantities
at $x = -A$, and then equate the two values of $\psi'/\psi$. Because we have
made our solution an even function, we get the same matching condition
at $x = A$ as at $x = -A$.

Now, in deriving (5.9), we implicitly assumed that $\psi$ is nonzero at
$x = \pm A$. We do not, however, get any nonzero solutions in which $\psi(\pm A) = 0$.
After all, at points where the cosine function in (5.8) is zero, its derivative
is nonzero. But no choice of the constant in front of the exponentials (5.6)
and (5.7) will produce a function that is zero but has a derivative that is
nonzero. ■
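
The computation of $\psi'/\psi$ used in the proof can be written out as a short worked step. Differentiating the exponential form (5.6) outside the well and the cosine form (5.8) inside the well gives

\[ \frac{\psi'(x)}{\psi(x)} = \sqrt{\varepsilon} \quad (x < -A), \qquad \frac{\psi'(x)}{\psi(x)} = -\sqrt{c - \varepsilon}\, \tan\left( \sqrt{c - \varepsilon}\, x \right) \quad (|x| < A). \]

Since the tangent is an odd function, the second expression tends to $\sqrt{c - \varepsilon}\,\tan(\sqrt{c - \varepsilon}\,A)$ as $x$ approaches $-A$ from inside the well. Equating this with $\sqrt{\varepsilon}$ is exactly the matching condition (5.9).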

Proposition 5.3 For all positive values of $c$ and $A$, there exists at least
one $\varepsilon \in (0, c)$ such that (5.9) holds.

Proof. Case 1: $\sqrt{c}A < \pi/2$. In this case, as $\varepsilon$ varies between 0 and $c$,
the left-hand side of (5.9) will vary between 0 and some positive number,
whereas the right-hand side of (5.9) will vary between some positive number
and 0. By the intermediate value theorem, there must exist $\varepsilon \in (0, c)$ for
which (5.9) holds. See Fig. 5.2.

Case 2: $\sqrt{c}A \ge \pi/2$. In this case, there is $\varepsilon_0 \in [0, c]$ for which
$\sqrt{c - \varepsilon_0}\,A = \pi/2$. As $\varepsilon$ decreases from $c$ to $\varepsilon_0$, the right-hand side of (5.9)
will vary from 0 to $+\infty$. Thus, for $\varepsilon$ slightly larger than $\varepsilon_0$, the right-hand
side of (5.9) will be larger than the left-hand side. By the intermediate value
theorem, there must exist $\varepsilon \in (\varepsilon_0, c)$ for which (5.9) holds. See Fig. 5.3 for a
case with $\sqrt{c}A$ slightly larger than $\pi/2$ and Fig. 5.4 for a case with $\sqrt{c}A$ much
larger than $\pi/2$. ■

Note that if $\sqrt{c}A$ is much larger than $\pi/2$, then there will be multiple
solutions of (5.9), as can be seen in Fig. 5.4.
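
The intermediate value argument above translates directly into a bisection search. The sketch below is one possible way to locate every solution of (5.9), under the following assumptions: we substitute $\theta = \sqrt{c - \varepsilon}\,A$, multiply (5.9) by $A\cos\theta$ to avoid the poles of the tangent, and use the fact that $\theta\tan\theta$ is increasing while $\sqrt{cA^2 - \theta^2}$ is decreasing on each branch $(n\pi, n\pi + \pi/2)$, so each branch meeting $(0, \sqrt{c}A)$ contains exactly one root. The function name is hypothetical.

```python
import math


def even_solutions(c, A, iterations=200):
    """All eps in (0, c) with sqrt(eps) = sqrt(c - eps) * tan(sqrt(c - eps) * A)."""
    R = math.sqrt(c) * A

    def F(theta):
        # (5.9) times A * cos(theta), with theta = sqrt(c - eps) * A.
        root = math.sqrt(max(R * R - theta * theta, 0.0))
        return theta * math.sin(theta) - root * math.cos(theta)

    solutions = []
    n = 0
    while n * math.pi < R:
        # One root per branch (n*pi, n*pi + pi/2) intersected with (0, R).
        lo = n * math.pi
        hi = min(lo + math.pi / 2.0, R)
        f_lo = F(lo)
        for _ in range(iterations):
            mid = 0.5 * (lo + hi)
            f_mid = F(mid)
            if f_lo * f_mid <= 0:
                hi = mid
            else:
                lo, f_lo = mid, f_mid
        theta = 0.5 * (lo + hi)
        solutions.append(c - (theta / A) ** 2)
        n += 1
    return solutions
```

With $A = 1$, the call `even_solutions(1.0, 1.0)` (so $\sqrt{c}A = 1$) returns a single value, while `even_solutions(900.0, 1.0)` (so $\sqrt{c}A = 30$) returns one value for each of the branches $n = 0, \ldots, 9$. The list is ordered from the largest $\varepsilon$ (the lowest energy) downward.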

We have found, then, at least one solution $\psi$ to (5.4) that satisfies the
matching condition and for which both $\psi$ and $\psi''$ decay exponentially at
infinity. Since this $\psi$ belongs to the domain of $H$, we have established the
following result.



Proposition 5.4 For any positive values of $A$ and $C$, there exists at least
one value of $E$ in the range $-C < E < 0$ for which (5.2) has a nonzero
solution in the domain of $H$, given by the formula

\[ \psi(x) = \begin{cases} \cos\left( \sqrt{c - \varepsilon}\, x \right), & -A \le x \le A \\ \cos\left( \sqrt{c - \varepsilon}\, A \right) \exp\left[ -\sqrt{\varepsilon}\,(|x| - A) \right], & |x| > A \end{cases} \]

where $c$ and $\varepsilon$ are defined in (5.5) and where $\varepsilon$ satisfies (5.9).

In Proposition 5.4, we have not normalized $\psi$ to be a unit vector in
$L^2(\mathbb{R})$, but rather have normalized $\psi$ to equal 1 at the origin. In Figs. 5.5-
5.7, we plot our eigenfunction in several different cases. In Fig. 5.5, we have
a “shallow” well, with $\sqrt{c}A = 1$. In that case, we obtain only one even
eigenvector, which is the ground state of the system (i.e., the eigenvector
with the smallest eigenvalue). Next, we consider a “deep” well, with
$\sqrt{c}A = 30$. For this well, the ground state is shown in Fig. 5.6 and an
“excited state” (i.e., an eigenvector with an eigenvalue that is not the
smallest) is shown in Fig. 5.7.

FIGURE 5.3. Solving the matching condition, Case 2a.

FIGURE 5.5. Ground state for a shallow potential well.

FIGURE 5.6. Ground state for a deep potential well.

Note that in the shallow well, the ground state extends quite a bit beyond
the interval $[-A, A]$, whereas in the deep well, the ground state goes to zero
very quickly as soon as we move outside the well. On the other hand, the
excited state in Fig. 5.7 extends comparatively far outside the well.

It is straightforward to adapt the preceding analysis to the odd case. The
matching condition (5.9) is replaced by

\[ \sqrt{\varepsilon} = -\sqrt{c - \varepsilon}\, \cot\left( \sqrt{c - \varepsilon}\, A \right) \tag{5.10} \]

(Exercise 2) and the formula for the eigenvectors is now

\[ \psi(x) = \begin{cases} \sin\left( \sqrt{c - \varepsilon}\, x \right), & -A \le x \le A \\ \pm \sin\left( \sqrt{c - \varepsilon}\, A \right) \exp\left[ -\sqrt{\varepsilon}\,(|x| - A) \right], & |x| > A \end{cases} \]

where we take the $+$ sign for $x > A$ and the $-$ sign for $x < -A$.

If $\sqrt{c}A < \pi/2$, then the matching condition (5.10) will have no
solutions, since the right-hand side of (5.10) will be negative for all $\varepsilon \in (0, c)$.
For large values of $\sqrt{c}A$, there will be several solutions to (5.10). A typical
matching scenario and an associated eigenfunction are plotted in Figs. 5.8
and 5.9.

FIGURE 5.9. An odd solution.


5.4 Tunneling and the Classically Forbidden 
Region 

Let us now briefly compare the classical situation to the quantum one.
Classically, if a particle has energy $E$, then since the kinetic energy $p^2/(2m)$
is always non-negative, the particle simply cannot be located at a point $x$
with $V(x) > E$. Thus, the region $V(x) \le E$ may be called the “classically
allowed” region and the region $V(x) > E$ the “classically forbidden” region.
In the case of a square well potential (5.1), if $-C < E < 0$, then the “well”
itself (i.e., the region with $-A \le x \le A$) is the classically allowed region
and the outside of the well (i.e., the region with $|x| > A$) is the classically
forbidden region.

Quantum mechanically, if $H\psi = E\psi$, then the particle has a definite
value for the energy, namely $E$. We see, however, that such a particle has
a nonzero probability of being located in the classically forbidden region.
Note that although the wave function is not zero in the classically forbidden
region, it does decay exponentially with the distance from the classically
allowed region. That is to say, the quantum particle can penetrate some
distance into the classically forbidden region. Note, however, that if $E$ is
much less than zero (i.e., $\varepsilon$ is large), then a state with $H\psi = E\psi$ will decay
very rapidly outside the well (like $\exp[-\sqrt{\varepsilon}(|x| - A)]$).
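
To put numbers on this penetration, one can integrate $|\psi|^2$ for the even eigenfunction of Proposition 5.4 in closed form. The sketch below is an illustration under stated assumptions rather than part of the text: it treats only the even ground state, locates the largest root $\varepsilon$ of (5.9) by bisection in $\theta = \sqrt{c - \varepsilon}\,A$ on $(0, \pi/2)$, and uses $\int_{-A}^{A} \cos^2(\kappa x)\,dx = A + \sin(2\kappa A)/(2\kappa)$ together with the two exponential tails, whose total weight is $\cos^2(\kappa A)/\sqrt{\varepsilon}$. The function names are hypothetical.

```python
import math


def ground_state_eps(c, A, iterations=200):
    """Largest root eps of (5.9), via bisection on the first tangent branch."""
    R = math.sqrt(c) * A
    lo, hi = 0.0, min(math.pi / 2.0, R)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        root = math.sqrt(max(R * R - mid * mid, 0.0))
        # (5.9) multiplied by A * cos(theta); negative below the root.
        if mid * math.sin(mid) - root * math.cos(mid) > 0:
            hi = mid
        else:
            lo = mid
    theta = 0.5 * (lo + hi)
    return c - (theta / A) ** 2


def probability_outside(c, A, eps):
    """Probability of |x| > A for the even eigenfunction of Proposition 5.4."""
    alpha = math.sqrt(eps)      # decay rate outside the well
    kappa = math.sqrt(c - eps)  # wave number inside the well
    inside = A + math.sin(2.0 * kappa * A) / (2.0 * kappa)
    outside = math.cos(kappa * A) ** 2 / alpha
    return outside / (inside + outside)
```

Comparing `probability_outside(1.0, 1.0, ground_state_eps(1.0, 1.0))` with the same quantity for `c = 900.0` illustrates the contrast drawn above: the shallow-well ground state carries a substantial fraction of its probability in the classically forbidden region, while the deep-well ground state carries very little.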

More generally, we can think about the time-dependent Schrödinger
equation for a particle with energy approximately equal to $E$. If we require
that the energy be exactly equal to $E$, then there is no interesting
time-dependence, since the solution to the time-dependent Schrödinger equation
is simply a constant times $\psi_0$. We can, however, think of a particle where
the uncertainty in the energy is nonzero but small. Suppose such a particle
is traveling through a region with $V < E$ and then approaches a region
with $V > E$ (a “potential barrier”). Classically, the particle would just
reflect off of this barrier and go back in the other direction. Quantum
mechanically, though, it is possible for the particle to “tunnel” through the
potential barrier and come out the other side. That is to say, at some later
time, there will be some non-negligible portion of the wave function on the
far side of the barrier.


5.5 Discrete and Continuous Spectrum 

Our analysis of the eigenvector equation (5.2) for $-C < E < 0$ shows that
there are only finitely many values of $E$ in this range for which we get
square-integrable solutions. It is not hard to analyze the case $E \le -C$
with the result that all nonzero solutions grow exponentially in at least
one direction (Exercise 6). Meanwhile, for $E > 0$, any solution to (5.2) on
$(-\infty, -A)$ has sinusoidal behavior and is not square-integrable unless it
is identically zero, in which case (by our matching condition) the solution
must be zero everywhere.

The upshot is that we obtain only finitely many square-integrable
solutions to (5.2), up to multiplying each solution by a constant. Clearly,
then, the “true” eigenvectors for $H$ [i.e., the ones that actually belong to
the Hilbert space $L^2(\mathbb{R})$] cannot form an orthonormal basis for $L^2(\mathbb{R})$.
Nevertheless, the spectral theorem (Chap. 7) provides something like an
orthonormal-basis decomposition of elements of $L^2(\mathbb{R})$ in terms of the
solutions to (5.2). A general element $\psi$ of $L^2(\mathbb{R})$ will be a sum of two terms.
The first term is a linear combination of the true ($L^2$) eigenvectors for
$H$, which have $E < 0$. The second term is a continuous superposition
(i.e., an integral) of the non-square-integrable “generalized eigenvectors”
with $E > 0$.

In Chap. 9, we will introduce the notion of the spectrum of a (possibly
unbounded) self-adjoint operator $A$. We will see that a number $\lambda$ belongs
to the spectrum of $A$ if for all $\varepsilon > 0$ there exists a unit vector $\psi$ in the
domain of $A$ for which $\|A\psi - \lambda\psi\| < \varepsilon$. In the case of the Hamiltonian
operator $H$ with a square well potential, it is not hard to show that every
real number $E$ with $E > 0$ belongs to the spectrum of $H$ (Exercise 4).

It can be shown that if a number $E < 0$ is not an eigenvalue (i.e., if there
are no nonzero $L^2$ solutions to $H\psi = E\psi$), then $E$ is not an element of the
spectrum of $H$. This result is hinted at by Exercise 5. Thus, the spectrum
of $H$ consists of a finite number of points in $(-C, 0)$ (at least one), together
with the whole half line $[0, \infty)$.

5.6 Exercises 

1. (a) Suppose $\psi$ is a smooth function on each of the intervals
$(-\infty, -A)$, $(-A, A)$, and $(A, \infty)$ and that both $\psi$ and $\psi'$ are
continuous at $x = A$ and at $x = -A$. Show that for any smooth
function $\chi$ with compact support, we have

\[ \int_{-\infty}^{\infty} \psi(x)\, \chi''(x)\, dx = \int_{-\infty}^{\infty} \psi''(x)\, \chi(x)\, dx, \tag{5.11} \]

where we leave $\psi''(x)$ undefined at $x = \pm A$ if the second
derivative does not exist at those points. (In light of Definition A.28,
(5.11) means that the second derivative of $\psi$, in the distribution
sense, is simply the function $\psi''$.)

Hint: Choose some interval $[-R, R]$ with $R > A$ containing the
support of $\chi$. Now use integration by parts separately on each
of the intervals $[-R, -A]$, $[-A, A]$, and $[A, R]$, paying careful
attention to the boundary terms.

(b) Suppose now that $\psi$ is a smooth function on each of the
intervals $(-\infty, -A)$, $(-A, A)$, and $(A, \infty)$, and that both $\psi$ and $\psi'$
have left and right limits at $x = \pm A$, but that, say, $\psi'$ has a
discontinuity at $x = -A$. Show that (5.11) has to be modified
by adding a nonzero multiple of $\chi(-A)$ to the right-hand side.

2. Verify the matching condition (5.10) for odd solutions of the
time-independent Schrödinger equation.

3. Let $\omega$ be a nonzero real number and consider a function of the form

\[ \psi(x) = a\cos(\omega x) + b\sin(\omega x) \]

for real numbers $a$ and $b$. If $a$ and $b$ are not both zero, show that for
any $A \in \mathbb{R}$, we have

\[ \lim_{B \to +\infty} \int_A^B \psi(x)^2\, dx = +\infty. \]

4. Let $f$ be a $C^\infty$ function on the interval $(0, 1)$ with the property that
$f(x) = 1$ for $0 < x < 1/3$ and $f(x) = 0$ for $2/3 < x < 1$. Then define
a family of “cutoff” functions $\chi_n$ on $\mathbb{R}$ by the formula

\[ \chi_n(x) = \begin{cases} 0, & |x| \ge n + 1 \\ 1, & |x| \le n \\ f(-x - n), & -(n + 1) < x < -n \\ f(x - n), & n < x < n + 1 \end{cases} \]

Given any $E > 0$, let $\psi$ be a nonzero solution to (5.2) for which $\psi(x)$
and $\psi'(x)$ are continuous at $x = \pm A$. Let $\psi_n = \psi\chi_n$. Show that $\psi_n$
belongs to the domain of $H$ and that

\[ \lim_{n \to \infty} \frac{\|H\psi_n - E\psi_n\|}{\|\psi_n\|} = 0. \]

Note: As we will see in Chap. 9, this implies that every real number
$E$ with $E > 0$ belongs to the spectrum of the operator $H$.

Hint: In estimating $\|\psi_n\|$, it may be helpful to apply Exercise 3 to
the real and imaginary parts of $\psi$ outside the well.

5. Suppose $E < 0$ and suppose that there exist no nonzero
square-integrable solutions to (5.2) for which $\psi$ and $\psi'$ are continuous. Let $\psi$
be a nonzero solution of (5.2) for which $\psi(x)$ and $\psi'(x)$ are continuous
at $x = \pm A$ and let $\psi_n$ be as in Exercise 4. Show that

\[ \frac{\|H\psi_n - E\psi_n\|}{\|\psi_n\|} \]

does not tend to zero as $n$ tends to infinity.

6. (a) Show that for $E < -C$, there are no nonzero square-integrable
solutions to (5.2) for which $\psi$ and $\psi'$ are continuous.

(b) Obtain the result of Part (a) when $E = -C$.

Hint: Analyze the even and odd cases separately.

7. Let the ground state for a particle in a square well denote the
eigenvector with the lowest (most negative) eigenvalue, which corresponds
to the largest value for $\varepsilon$.

(a) Show that the ground state is always an even function. That is
to say, show that the largest value of $\varepsilon$ satisfying (5.9) is always
larger than any solution to (5.10).

(b) Show that the ground state is a nowhere-zero function.

(b) Show that the ground state is a nowhere-zero function. 


6 

Perspectives on the Spectral Theorem 


6.1 The Difficulties with the Infinite-Dimensional 
Case 

Suppose $A$ is a self-adjoint $n \times n$ matrix, meaning that $A_{kj} = \overline{A_{jk}}$ for all
$1 \le j, k \le n$. Then a standard result in linear algebra asserts that there
exist an orthonormal basis $\{v_j\}_{j=1}^n$ for $\mathbb{C}^n$ and real numbers $\lambda_1, \ldots, \lambda_n$
such that $Av_j = \lambda_j v_j$. (See Theorem 18 in Chap. 8 of [24] and Exercise 4
in Chap. 7.)

We may state the same result in basis-independent language as follows.
Suppose $\mathbf{H}$ is a finite-dimensional Hilbert space and $A$ is a self-adjoint
linear operator on $\mathbf{H}$, meaning that $\langle \phi, A\psi \rangle = \langle A\phi, \psi \rangle$ for all
$\phi, \psi \in \mathbf{H}$. Then there exists an orthonormal basis of $\mathbf{H}$ consisting of
eigenvectors for $A$ with real eigenvalues.

Since there is a standard notion of orthonormal bases for general Hilbert
spaces, we might hope that a similar result would hold for self-adjoint
operators on infinite-dimensional Hilbert spaces. Simple examples, however,
show that a self-adjoint operator may not have any eigenvectors. Consider,
for example, $\mathbf{H} = L^2([0, 1])$ and an operator $A$ on $\mathbf{H}$ defined by

\[ (A\psi)(x) = x\psi(x). \tag{6.1} \]

Then $A$ satisfies $\langle \phi, A\psi \rangle = \langle A\phi, \psi \rangle$ for all $\phi, \psi \in L^2([0, 1])$, and yet $A$
has no eigenvectors. After all, if $x\psi(x) = \lambda\psi(x)$, then $\psi$ would have to be
supported on the set where $x = \lambda$, which is a set of measure zero. Thus,
only the zero element of $L^2([0, 1])$ satisfies $A\psi = \lambda\psi$.
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Now, a physicist would say that the operator A in (6.1) does have 
eigenvectors, namely the distributions S(x — A). (See Appendix A.3.3.) 
These distributions indeed satisfy xS(x — A) = A<5(cc — A), but they do 
not belong to the Hilbert space L 2 ([ 0,1]). Such “eigenvectors,” which be¬ 
long to some larger space than H, are known as generalized eigenvectors. 
Even though these generalized eigenvectors are not actually in the Hilbert 
space, we may hope that there is some sense in which they form something 
like a orthonormal basis. See Sect. 6.6 for an example of how such a “basis” 
might function. 

Let us mention in passing that our simple expectation of a true orthonormal
basis of eigenvectors is realized for compact self-adjoint operators,
where an operator $A$ on $\mathbf{H}$ is said to be compact if the image under $A$ of
every bounded set in $\mathbf{H}$ has compact closure; see Theorem VI.16 in
Volume I of [34]. The operators of interest in quantum mechanics, however,
are not compact. (Of course, even if a self-adjoint operator is not compact,
it might still have an orthonormal basis of eigenvectors, as, e.g., in the case
of the Hamiltonian operator for a harmonic oscillator. See Chap. 11.)

Meanwhile, there is another serious difficulty that arises with self-adjoint
operators in the infinite-dimensional case. Most of the self-adjoint operators
$A$ of quantum mechanics are unbounded operators, meaning that there is
no constant $C$ such that $\|A\psi\| \le C\|\psi\|$ for all $\psi$. Suppose, for example,
that $A$ is the position operator $X$ on $L^2(\mathbb{R})$, given by $(X\psi)(x) = x\psi(x)$. If
$1_E$ denotes the indicator function of $E$ (the function that is 1 on $E$ and 0
elsewhere), then it is apparent that

\[
\|X 1_{[n,n+1]}\| \ge n \,\|1_{[n,n+1]}\|
\]

for every positive integer $n$, and, thus, $X$ cannot be bounded. Now, using
the closed graph theorem and elementary results from Sect. 9.3, it can be
shown that if $A$ is defined on all of $\mathbf{H}$ and satisfies $\langle\phi, A\psi\rangle = \langle A\phi, \psi\rangle$ for
all $\phi, \psi \in \mathbf{H}$, then $A$ must be bounded. (See Corollary 9.9.) Thus, if $A$ is
unbounded and self-adjoint, it cannot be defined on all of $\mathbf{H}$.
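
As a quick sanity check of the inequality $\|X 1_{[n,n+1]}\| \ge n\,\|1_{[n,n+1]}\|$, the following minimal Python sketch evaluates both norms in closed form. The function name `position_ratio` is a hypothetical helper introduced only for this illustration; the computation uses nothing beyond the elementary integral of $x^2$ over $[n, n+1]$.

```python
import math
from fractions import Fraction


def position_ratio(n: int) -> float:
    """Return ||X 1_[n,n+1]|| / ||1_[n,n+1]|| in L^2(R), using exact integrals.

    ||1_[n,n+1]||^2   = integral of 1   over [n, n+1] = 1
    ||X 1_[n,n+1]||^2 = integral of x^2 over [n, n+1] = ((n+1)^3 - n^3) / 3
    """
    norm_sq_indicator = Fraction(1)
    norm_sq_x_indicator = Fraction((n + 1) ** 3 - n ** 3, 3)
    return math.sqrt(norm_sq_x_indicator / norm_sq_indicator)


# The ratio grows without bound, so no single constant C can work for all n.
ratios = [position_ratio(n) for n in (1, 10, 100, 1000)]
```

Since $((n+1)^3 - n^3)/3 = n^2 + n + 1/3$, each ratio exceeds $n$, which is exactly the obstruction to boundedness described above.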

We define, then, an “unbounded operator on $\mathbf{H}$” to be a linear operator $A$
from a dense subspace of $\mathbf{H}$, known as the domain of $A$, to $\mathbf{H}$. The
notion of self-adjointness for such operators is more complicated than in the
bounded case. The obvious condition, that $\langle\phi, A\psi\rangle$ should equal $\langle A\phi, \psi\rangle$ for
all $\phi$ and $\psi$ in the domain of $A$, is not the “right” condition. Specifically,
that condition is not sufficient to guarantee that the spectral theorem
applies to $A$. Rather, for any unbounded operator $A$, we will define the adjoint
$A^*$ of $A$, which will be an unbounded operator with its own domain. An
unbounded operator is then defined to be self-adjoint if the domains of $A$
and $A^*$ are the same and $A$ and $A^*$ agree on their common domain. That
is to say, self-adjointness means not only that $A$ and $A^*$ agree whenever
they are both defined, but also that the domains of $A$ and $A^*$ agree.

6.2 The Goals of Spectral Theory


Before getting into the details of the spectral theory, let us think for a
moment about what it is we want the spectral theorem to do for us. In the
first place, we would like the spectral theorem to allow us to apply various
functions to an operator. We saw, for example, that the time-dependent
Schrödinger equation can be “solved” by setting $\psi(t) = \exp\{-it\hat{H}/\hbar\}\psi_0$.
Because the Hamiltonian operator $\hat{H}$ is unbounded, it is not convenient
to use power series to define the exponential. If, however, $\hat{H}$ has a true
orthonormal basis $\{e_k\}$ of eigenvectors with corresponding eigenvalues $\lambda_k$,
then we can define $\exp\{-it\hat{H}/\hbar\}$ to be the unique bounded operator with
the property that

\[
e^{-it\hat{H}/\hbar} e_k = e^{-it\lambda_k/\hbar} e_k
\]

for all $k$.
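
To make the eigenbasis definition concrete, here is a minimal Python sketch under the assumption that a vector is stored as its list of coefficients in the orthonormal eigenbasis $\{e_k\}$. The names `evolve` and `norm_sq`, the sample eigenvalues, and the choice of units with $\hbar = 1$ are all illustrative assumptions, not notation from the text.

```python
import cmath

HBAR = 1.0  # assumption for illustration: units in which hbar = 1


def evolve(coeffs, eigenvalues, t, hbar=HBAR):
    """Apply exp(-i t H / hbar) to a vector given by its coefficients c_k
    in an orthonormal eigenbasis {e_k} of H with eigenvalues lambda_k.

    Each coefficient is simply multiplied by the phase exp(-i t lambda_k / hbar).
    """
    return [c * cmath.exp(-1j * t * lam / hbar)
            for c, lam in zip(coeffs, eigenvalues)]


def norm_sq(coeffs):
    """Squared norm of the vector: the sum of |c_k|^2."""
    return sum(abs(c) ** 2 for c in coeffs)


eigenvalues = [0.5, 1.5, 2.5]   # hypothetical eigenvalues lambda_k
psi0 = [0.6, 0.0, 0.8j]         # unit vector, since 0.36 + 0.64 = 1
psi_t = evolve(psi0, eigenvalues, t=2.0)
```

Because every coefficient is multiplied by a number of absolute value 1, the operator defined this way preserves norms, and evolving for time $s$ and then time $t$ agrees with evolving for time $s + t$.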

In cases where $\hat{H}$ does not have a true orthonormal basis of eigenvectors,
we would like the spectral theorem to provide a “functional calculus” for
$\hat{H}$, that is, a system for applying functions (including exponentials) to $\hat{H}$.
This functional calculus should have properties similar to what we have in
the case of a true orthonormal basis of eigenvectors.

In the second place, we would like the spectral theorem to provide a
probability distribution for the result of measuring a self-adjoint
operator $A$. Let us recall how measurement probabilities work in the case that
$A$ has a true orthonormal basis $\{e_j\}$ of eigenvectors with eigenvalues $\lambda_j$.
Building on Example 3.12, we may compute the probabilities in such a case
as follows. Given any Borel set $E \subset \mathbb{R}$, let $V_E$ be the closed span of all the
eigenvectors for $A$ with eigenvalues in $E$, and let $P_E$ be the orthogonal
projection onto $V_E$. Then for any unit vector $\psi$, we have

\[
\mathrm{prob}_\psi(A \in E) = \langle\psi, P_E\psi\rangle. \tag{6.2}
\]

In particular, if the eigenvalues are distinct and $\psi$ decomposes as
$\psi = \sum_j c_j e_j$, the probability of observing the value $\lambda_j$ will be $|c_j|^2$ (as in
Example 3.12), since $P_{\{\lambda_j\}}$ is just the projection onto $e_j$.
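
Formula (6.2) can be illustrated in the eigenbasis case by a short Python sketch. It assumes the unit vector $\psi$ is stored through its coefficients $c_j$ and that the Borel set $E$ is given as a membership predicate; the name `prob_in_set` and the sample numbers are hypothetical.

```python
def prob_in_set(coeffs, eigenvalues, in_E):
    """Return <psi, P_E psi> for psi = sum_j c_j e_j.

    P_E keeps exactly the components along eigenvectors whose eigenvalue
    lies in E, so the inner product is the sum of |c_j|^2 over those j.
    """
    return sum(abs(c) ** 2
               for c, lam in zip(coeffs, eigenvalues)
               if in_E(lam))


eigenvalues = [-1.0, 0.0, 2.0]        # hypothetical distinct eigenvalues
coeffs = [0.5, 0.5j, 0.5 ** 0.5]      # |c_j|^2 = 0.25, 0.25, 0.5

p_nonneg = prob_in_set(coeffs, eigenvalues, lambda lam: lam >= 0.0)
p_single = prob_in_set(coeffs, eigenvalues, lambda lam: lam == 2.0)
p_total = prob_in_set(coeffs, eigenvalues, lambda lam: True)
```

Taking $E$ to be a single eigenvalue recovers $|c_j|^2$, and taking $E = \mathbb{R}$ gives total probability 1 for a unit vector.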

In cases where $A$ does not have a true orthonormal basis of eigenvectors,
we would like the spectral theorem to provide a family of projection
operators $P_E$, one for each Borel subset $E \subset \mathbb{R}$, which will allow us to define
probabilities as in (6.2). We will call these projection operators spectral
projections and the associated subspaces $V_E$ spectral subspaces. (Thus, $P_E$
is the orthogonal projection onto $V_E$.) Intuitively, $V_E$ may be thought of as
the closed span of all the generalized eigenvectors with eigenvalues in $E$.

In the first version of the spectral theorem, both these goals will be
achieved, with the spectral projections being provided by a projection-valued
measure and the functional calculus being provided by integration
with respect to this measure. Although having (generalized) eigenvectors
for a self-adjoint operator is, from a practical standpoint, of secondary
importance, we provide a framework for understanding such eigenvectors,
using the concept of a direct integral. The second version of the spectral
theorem decomposes the Hilbert space $\mathbf{H}$ as a direct integral, with respect
to a certain measure $\mu$, of generalized eigenspaces for a self-adjoint
operator $A$. The generalized eigenspace for a particular eigenvalue $\lambda$ will not
actually be a subspace of $\mathbf{H}$, unless $\mu(\{\lambda\}) > 0$. Thus, the notion of a direct
integral gives a rigorous meaning to the notion of “eigenvectors” that are
not actually in the Hilbert space.

6.3 A Guide to Reading 

Although the portion of this book devoted to spectral theory is unavoidably 
technical in places, it has been designed so that the reader can take in as 
much or as little as desired. The reader who is willing to take things on faith 
can simply take in the examples of the position and momentum operators 
in Sects. 6.4 and 6.6 and accept these as prototypes of how the spectral 
theorem works. The reader who wants more details can find the statement 
of the spectral theorem for bounded operators, in two different forms, in 
Chap. 7, and can find the basics of unbounded self-adjoint operators in 
Chap. 9. Finally, the reader who wants a complete treatment of the subject 
can find full proofs of the spectral theorem in both forms, first for bounded 
operators in Chap. 8, and then for unbounded operators in Chap. 10. 

6.4 The Position Operator 

As our first example, let us consider the position operator $X$, given by
$(X\psi)(x) = x\psi(x)$, acting on the Hilbert space $\mathbf{H} = L^2(\mathbb{R})$. As for the
similar operator in Sect. 6.1, $X$ has no true eigenvectors, that is, no
eigenvectors that are actually in $\mathbf{H}$. If we think that the generalized eigenvectors
for $X$ are the distributions $\delta(x - \lambda)$, $\lambda \in \mathbb{R}$, then we may make an educated
guess that the spectral subspace $V_E$ should consist of those functions that
are “supported” on $E$, that is, those that are zero almost everywhere on the
complement of $E$. (A superposition of the “functions” $\delta(x - \lambda)$, with $\lambda \in E$,
should be a function supported on $E$.)

The spectral projection $P_E$ is then the orthogonal projection onto $V_E$,
which may be computed as

\[
P_E\psi = 1_E\psi,
\]

where $1_E$ is the indicator function of $E$. In that case, we have,
following (6.2),

\[
\mathrm{prob}_\psi(X \in E) = \langle\psi, 1_E\psi\rangle = \int_E |\psi(x)|^2 \, dx.
\]

This formula is just what we would have expected from our discussion in
Chap. 3, where we claimed that the probability distribution for the position
of the particle is $|\psi(x)|^2$.

Meanwhile, let us consider the functional calculus for $X$. If $f(\lambda) = \lambda^m$,
then $f(X)$ should be just the $m$th power of $X$, which is multiplication by
$x^m$. It seems reasonable, then, to think that for any function $f$, we should
define $f(X)$ to be simply multiplication by $f(x)$. In particular, the operator
$e^{iaX}$ should be simply multiplication by $e^{iax}$, which is a bounded operator
on $L^2(\mathbb{R})$.


6.5 Multiplication Operators 

Since the position operator acts simply as multiplication by the function
$x$, it is straightforward to find the spectral subspaces and also to construct
the functional calculus for $X$. We may consider multiplication operators in
a more general setting. If $\mathbf{H} = L^2(X, \mu)$ and $h$ is a real-valued measurable
function on $X$, then we may define the multiplication operator $M_h$ on
$L^2(X, \mu)$ by

\[
M_h\psi = h\psi.
\]

We can then construct spectral subspaces as

\[
V_E = \{\psi \mid \psi \text{ is supported on } h^{-1}(E)\}
\]

and define a functional calculus by

\[
f(M_h) = \text{multiplication by } f \circ h.
\]

One form of the spectral theorem may now be stated simply as follows: A
self-adjoint operator $A$ on a separable Hilbert space is unitarily equivalent
to a multiplication operator. That is to say, there is some $\sigma$-finite measure
space $(X, \mu)$ and some measurable function $h$ on $X$ such that $A$ is
unitarily equivalent to multiplication by $h$. (See Theorem 7.20.) Although
this version of the spectral theorem is compellingly easy to state, there is
a slight modification of it, involving direct integrals, that is in some ways
even better. See Sect. 7.3 for more information.


6.6 The Momentum Operator 

Let us now see how the spectral theorem works out in the case of the
momentum operator, $P = -i\hbar\, d/dx$ on $L^2(\mathbb{R})$. The “eigenvectors” for
$P$ are the functions $e^{ikx}$, $k \in \mathbb{R}$, with the corresponding eigenvalues
being $\hbar k$. Although the functions $e^{ikx}$ are not in $L^2(\mathbb{R})$, the Fourier
transform shows that any function in $L^2(\mathbb{R})$ can be expanded as a superposition
(i.e., continuous version of a linear combination) of these functions. (See
Appendix A.3.2.) Indeed, the Fourier transform is very much like the
decomposition of a vector in an orthonormal basis, in that the Fourier
coefficients $\hat{\psi}(k)$ can be expressed in terms of the “inner product” of a function
$\psi$ with $e^{ikx}$:

\[
\hat{\psi}(k) = (2\pi)^{-1/2} \int_{-\infty}^{\infty} e^{-ikx}\psi(x) \, dx
= (2\pi)^{-1/2} \left\langle e^{ikx}, \psi \right\rangle_{L^2(\mathbb{R})},
\]

if we ignore the fact that $e^{ikx}$ is not actually in $L^2$.

Indeed, physicists frequently understand the Fourier transform by
asserting that the functions $e^{ikx}/\sqrt{2\pi}$ form an “orthonormal basis in the
continuous sense” for $L^2(\mathbb{R})$. Orthonormality in the continuous sense is supposed
to mean that one replaces the usual Kronecker delta in the definition of an
orthonormal set by the Dirac $\delta$-function

\[
\left\langle \frac{e^{ikx}}{\sqrt{2\pi}}, \frac{e^{ilx}}{\sqrt{2\pi}} \right\rangle_{L^2(\mathbb{R})} = \delta(k - l),
\]

where $\delta$ is supposed to satisfy

\[
\int_{-\infty}^{\infty} f(k)\,\delta(k - l) \, dk = f(l) \tag{6.3}
\]

for all continuous functions $f$. (Rigorously, $\delta(k - l)$ is a distribution; see
Appendix A.3.3.)

To give some rigorous meaning to (6.3), note that although the inner
product of $e^{ikx}$ and $e^{ilx}$ is not defined, we may approximate this inner
product by the expression

\[
\frac{1}{2\pi} \int_{-A}^{A} e^{-ikx} e^{ilx} \, dx
= \frac{1}{2\pi} \left. \frac{e^{-i(k-l)x}}{-i(k - l)} \right|_{x=-A}^{A}
= \frac{A}{\pi} \frac{\sin[A(k - l)]}{A(k - l)}.
\]

It is possible to show that the above function, viewed as a function of $k$ for
fixed $A$ and $l$, behaves like $\delta(k - l)$ in the limit as $A$ tends to infinity. That
is to say, for all sufficiently nice functions $f$, we have

\[
\lim_{A \to \infty} \int_{-\infty}^{\infty} f(k) \, \frac{A}{\pi} \frac{\sin[A(k - l)]}{A(k - l)} \, dk = f(l). \tag{6.4}
\]

Here is a heuristic argument for (6.4). By making the change of variable
$k' = k - l$, we may reduce the general problem to the case $l = 0$. If we then
make the change of variable $\kappa = Ak$, the desired result is equivalent to

\[
\lim_{A \to +\infty} \int_{-\infty}^{\infty} f\!\left(\frac{\kappa}{A}\right) \frac{1}{\pi} \frac{\sin\kappa}{\kappa} \, d\kappa = f(0). \tag{6.5}
\]

Now, if we can bring the limit inside the integral, $f(\kappa/A)$ will tend to $f(0)$
as $A$ tends to infinity. Since the rest of the integrand on the left-hand
side of (6.5) is already independent of $A$, the result would then follow if we
could show that

\[
\int_{-\infty}^{\infty} \frac{1}{\pi} \frac{\sin\kappa}{\kappa} \, d\kappa = 1. \tag{6.6}
\]

Even though the integral in (6.6) is not absolutely convergent, it is a
convergent improper integral. The value of the integral can be obtained by the
method of contour integration (or the method of consulting a table of
integrals), and indeed (6.6) holds. Since (6.3) is, in any case, only a heuristic
way of thinking about the Fourier transform, we will not take the time to
develop a rigorous version of the preceding argument.
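
The improper integral in (6.6) can also be examined numerically. The following Python sketch, offered only as an illustration and not as a substitute for the contour-integration argument, approximates the integral of $\sin\kappa/(\pi\kappa)$ over $[-L, L]$ by the composite Simpson rule. The helper names `sinc_over_pi` and `truncated_integral` and the step size are assumptions made for this example.

```python
import math


def sinc_over_pi(kappa: float) -> float:
    """Integrand of (6.6): sin(kappa) / (pi * kappa), with value 1/pi at 0."""
    if kappa == 0.0:
        return 1.0 / math.pi
    return math.sin(kappa) / (math.pi * kappa)


def truncated_integral(L: float, steps_per_unit: int = 100) -> float:
    """Composite Simpson approximation of the integral over [-L, L].

    The integrand is even, so we integrate over [0, L] and double the result.
    """
    n = 2 * max(1, int(L * steps_per_unit / 2))  # even number of subintervals
    h = L / n
    total = sinc_over_pi(0.0) + sinc_over_pi(L)
    for i in range(1, n):
        weight = 4 if i % 2 else 2
        total += weight * sinc_over_pi(i * h)
    return 2.0 * total * h / 3.0


# Truncated integrals for increasing cutoffs L; they oscillate around 1.
values = {L: truncated_integral(L) for L in (10.0, 100.0, 1000.0)}
```

The truncated values approach 1 only slowly, with an error of order $1/L$, which reflects the fact that the integral is convergent but not absolutely convergent.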

It is possible to derive, at least formally, many of the standard properties
of the Fourier transform by using (6.3), just as one can obtain properties
of Fourier series by using the orthonormality of the functions $e^{2\pi inx}$ in
$L^2([0,1])$. More importantly, the Fourier transform is precisely the unitary
transformation that changes the momentum operator into a multiplication
operator. To see this property of the Fourier transform more clearly, we
introduce a simple rescaling of it.

Definition 6.1 For any $\psi \in L^2(\mathbb{R})$, define $\tilde{\psi}$ by

\[
\tilde{\psi}(p) = \frac{1}{\sqrt{\hbar}} \, \hat{\psi}\!\left(\frac{p}{\hbar}\right),
\]

so that

\[
\tilde{\psi}(p) = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty} e^{-ipx/\hbar} \psi(x) \, dx.
\]

The function $\tilde{\psi}(p)$ is the momentum wave function associated with $\psi$.

By the Plancherel theorem (Theorem A.19) and a change of variable, if $\psi$
is a unit vector, then so is $\hat{\psi}$ and also $\tilde{\psi}$. For any unit vector $\psi$, we interpret
$|\tilde{\psi}(p)|^2$ as the probability density for the momentum of the particle, just as
$|\psi(x)|^2$ is the probability distribution of the position of the particle. Using
Proposition A.17, we may readily verify that for nice enough $\psi$, we have

\[
\widetilde{P\psi}(p) = p\,\tilde{\psi}(p). \tag{6.7}
\]

Equation (6.7) means that the unitary map $\psi \mapsto \tilde{\psi}$ turns the momentum
operator $P$ into multiplication by $p$. That is to say, the spectral theorem,
in its “multiplication operator” form, is accomplished in this case by the
Fourier transform (scaled as in Definition 6.1).

In terms of the momentum wave function, we may define spectral
projections and a functional calculus for $P$, just as in Sect. 6.5. For any Borel
set $E \subset \mathbb{R}$, we may define a projection $P_E$ to be the orthogonal projection
onto the space of functions $\psi$ for which $\tilde{\psi}(p)$ is zero almost everywhere
outside of $E$. If $f$ is any bounded measurable function on $\mathbb{R}$, we can define
an operator $f(P)$ by defining $f(P)\psi$ to be the unique element of $L^2(\mathbb{R})$ for
which

\[
\widetilde{f(P)\psi}(p) = f(p)\,\tilde{\psi}(p).
\]


7 

The Spectral Theorem for Bounded 
Self-Adjoint Operators: Statements 


In the present chapter, we will consider the spectral theorem for bounded 
self-adjoint operators, leaving a discussion of unbounded operators to 
Chaps. 9 and 10. The proofs of the main theorems (two different versions 
of the spectral theorem) are moderately long and are deferred to Chap. 8. 
After some elementary definitions and results in Sect. 7.1, we come to the 
main results in Sects. 7.2 and 7.3. Throughout the chapter, H will, as usual, 
denote a separable Hilbert space over C. 


7.1 Elementary Properties of Bounded Operators 


As usual, we will let $\mathbf{H}$ denote a separable complex Hilbert space. Recall
from Appendix A.3.4 that a linear operator $A$ on $\mathbf{H}$ is said to be bounded
if the operator norm of $A$,

\[
\|A\| := \sup_{\psi \in \mathbf{H} \setminus \{0\}} \frac{\|A\psi\|}{\|\psi\|}, \tag{7.1}
\]

is finite. The space of bounded operators on $\mathbf{H}$ forms a Banach space under
the operator norm, and we have the inequality

\[
\|AB\| \le \|A\| \, \|B\| \tag{7.2}
\]

for all bounded operators $A$ and $B$.

Definition 7.1 The Banach space of bounded operators on $\mathbf{H}$, with respect
to the operator norm (7.1), is denoted $\mathcal{B}(\mathbf{H})$.

Recall (Appendix A.4.3) that for any $A \in \mathcal{B}(\mathbf{H})$ there is a unique operator
$A^* \in \mathcal{B}(\mathbf{H})$, called the adjoint of $A$, such that

\[
\langle\phi, A\psi\rangle = \langle A^*\phi, \psi\rangle
\]

for all $\phi, \psi \in \mathbf{H}$. An operator $A \in \mathcal{B}(\mathbf{H})$ is called self-adjoint if $A^* = A$.
We say that $A \in \mathcal{B}(\mathbf{H})$ is non-negative if

\[
\langle\psi, A\psi\rangle \ge 0 \tag{7.3}
\]

for all $\psi \in \mathbf{H}$.

Proposition 7.2 For all $A \in \mathcal{B}(\mathbf{H})$, we have

\[
\|A^*\| = \|A\|
\]

and

\[
\|A^*A\| = \|A\|^2.
\]

In particular, if $A$ is self-adjoint, we have the useful result that
$\|A^2\| = \|A\|^2$.

Proof. The operator norm of $A$ can also be computed as

\[
\|A\| = \sup_{\|\psi\| = 1} \|A\psi\|.
\]

Furthermore, for any vector $\phi \in \mathbf{H}$, $\|\phi\| = \sup_{\|\chi\| = 1} |\langle\chi, \phi\rangle|$. (Inequality
in one direction is by the Cauchy-Schwarz inequality, and inequality in the
other direction is by taking $\chi$ to be a multiple of $\phi$.) Thus,

\[
\|A\| = \sup_{\|\phi\| = \|\psi\| = 1} |\langle\phi, A\psi\rangle|.
\]

From this, we get

\[
\begin{aligned}
\|A^*\| &= \sup_{\|\phi\| = \|\psi\| = 1} |\langle\phi, A^*\psi\rangle| \\
&= \sup_{\|\phi\| = \|\psi\| = 1} |\langle A\phi, \psi\rangle| \\
&= \sup_{\|\phi\| = \|\psi\| = 1} |\langle\psi, A\phi\rangle| \\
&= \|A\|.
\end{aligned}
\]

Meanwhile, $\|A^*A\| \le \|A^*\| \, \|A\| = \|A\|^2$. On the other hand,

\[
\begin{aligned}
\|A^*A\| &= \sup_{\|\phi\| = \|\psi\| = 1} |\langle\phi, A^*A\psi\rangle| \\
&= \sup_{\|\phi\| = \|\psi\| = 1} |\langle A\phi, A\psi\rangle| \\
&\ge \sup_{\|\psi\| = 1} |\langle A\psi, A\psi\rangle| \\
&= \|A\|^2,
\end{aligned}
\]

which establishes the inequality in the other order. ■

We now record an elementary but very useful result.

Proposition 7.3 For all $A \in \mathcal{B}(\mathbf{H})$, we have

\[
[\mathrm{Range}(A)]^\perp = \ker(A^*),
\]

where for any $B \in \mathcal{B}(\mathbf{H})$, $\ker(B)$ denotes the kernel of $B$.

Proof. Suppose first that $\psi$ belongs to $[\mathrm{Range}(A)]^\perp$. Then for all $\phi \in \mathbf{H}$,
we have

\[
0 = \langle\psi, A\phi\rangle = \langle A^*\psi, \phi\rangle. \tag{7.4}
\]

This implies that $A^*\psi = 0$ and thus that $\psi \in \ker(A^*)$. Conversely, suppose
$\psi \in \ker(A^*)$. Then for all $\phi \in \mathbf{H}$, (7.4) holds (reading the equation from
right to left). This shows that $\psi$ is orthogonal to every element of the form
$A\phi$, meaning that $\psi \in [\mathrm{Range}(A)]^\perp$. ■

Next, we define the spectrum of a bounded operator, which plays the
same role as the set of eigenvalues in the finite-dimensional case.

Definition 7.4 For $A \in \mathcal{B}(\mathbf{H})$, the resolvent set of $A$, denoted $\rho(A)$,
is the set of all $\lambda \in \mathbb{C}$ such that the operator $(A - \lambda I)$ has a bounded
inverse. The spectrum of $A$, denoted by $\sigma(A)$, is the complement in $\mathbb{C}$ of
the resolvent set. For $\lambda$ in the resolvent set of $A$, the operator $(A - \lambda I)^{-1}$
is called the resolvent of $A$ at $\lambda$.

Saying that $(A - \lambda I)$ has a bounded inverse means that there exists a
bounded operator $B$ such that

\[
(A - \lambda I)B = B(A - \lambda I) = I.
\]

If $A$ is bounded and $A - \lambda I$ is one-to-one and maps $\mathbf{H}$ onto $\mathbf{H}$, then it
follows from the closed graph theorem (Theorem A.39) that the inverse
map must be bounded. Thus, the resolvent set of $A$ can alternatively be
described as the set of $\lambda \in \mathbb{C}$ for which $A - \lambda I$ is one-to-one and onto.

Proposition 7.5 For all $A \in \mathcal{B}(\mathbf{H})$, the following results hold.

1. The spectrum $\sigma(A)$ of $A$ is a closed, bounded, and nonempty subset
of $\mathbb{C}$.

2. If $|\lambda| > \|A\|$, then $\lambda$ is in the resolvent set of $A$.

Lemma 7.6 Suppose $X \in \mathcal{B}(\mathbf{H})$ satisfies $\|X\| < 1$. Then the operator
$I - X$ is invertible, with the inverse given by the following convergent series
in $\mathcal{B}(\mathbf{H})$:

\[
(I - X)^{-1} = I + X + X^2 + X^3 + \cdots \tag{7.5}
\]

Proof. As a consequence of (7.2), we have $\|X^m\| \le \|X\|^m$. The (geometric)
series on the right-hand side of (7.5) is therefore absolutely convergent and
thus convergent in the Banach space $\mathcal{B}(\mathbf{H})$ (Appendix A.3.4). If we multiply
this series on either side by $(I - X)$, everything will cancel except $I$, showing
that the sum of the series is the inverse of $(I - X)$. ■
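
In finite dimensions the series (7.5) can be tested directly. The sketch below, a minimal illustration in plain Python, sums the first terms of the series for a $2 \times 2$ real matrix and multiplies the result by $I - X$. The matrix `X`, the helper names, and the number of terms are assumptions chosen for the example; the Frobenius norm of `X` is $\sqrt{0.30} < 1$, which bounds its operator norm.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matadd(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


IDENTITY = [[1.0, 0.0], [0.0, 1.0]]


def neumann_inverse(x, terms=60):
    """Partial sum I + X + X^2 + ... + X^(terms - 1) of the series (7.5)."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    power = IDENTITY
    for _ in range(terms):
        total = matadd(total, power)
        power = matmul(power, x)
    return total


X = [[0.2, 0.3], [-0.1, 0.4]]   # hypothetical matrix with operator norm < 1
approx_inv = neumann_inverse(X)
I_minus_X = [[IDENTITY[i][j] - X[i][j] for j in range(2)] for i in range(2)]
product = matmul(I_minus_X, approx_inv)   # expected to be close to I
```

The cancellation described in the proof shows up here as a telescoping sum: $(I - X)$ times the partial sum equals $I - X^{\text{terms}}$, and the remainder is negligible once $\|X\|^{\text{terms}}$ is small.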
Proof of Proposition 7.5. For any nonzero $\lambda \in \mathbb{C}$, consider the operator

\[
A - \lambda I = -\lambda \left( I - \frac{A}{\lambda} \right).
\]

If $|\lambda| > \|A\|$, then $\|A/\lambda\| < 1$, and $I - A/\lambda$ is invertible by the lemma. It
then follows that $A - \lambda I$ is invertible, with

\[
(A - \lambda I)^{-1} = -\frac{1}{\lambda} \left( I + \frac{A}{\lambda} + \frac{A^2}{\lambda^2} + \cdots \right). \tag{7.6}
\]

Thus, $\lambda$ is in the resolvent set of $A$. This establishes Point 2 in the
proposition and shows that $\sigma(A)$ is bounded.

Suppose now that $\lambda_0 \in \mathbb{C}$ is in the resolvent set of $A$. Then for another
number $\lambda \in \mathbb{C}$, we have

\[
\begin{aligned}
A - \lambda I &= A - \lambda_0 I - (\lambda - \lambda_0) I \\
&= (A - \lambda_0 I) \left( I - (\lambda - \lambda_0)(A - \lambda_0 I)^{-1} \right).
\end{aligned} \tag{7.7}
\]

Thus, if

\[
|\lambda - \lambda_0| < \frac{1}{\|(A - \lambda_0 I)^{-1}\|},
\]

both factors on the right-hand side of (7.7) will be invertible, so that $A - \lambda I$
is also invertible. Thus, the resolvent set of $A$ is open and the spectrum is
closed.

To show that $\sigma(A)$ is nonempty, note that $(A - \lambda I)^{-1}$ may be computed as
follows:

\[
\begin{aligned}
(A - \lambda I)^{-1} &= \left( I - (\lambda - \lambda_0)(A - \lambda_0 I)^{-1} \right)^{-1} (A - \lambda_0 I)^{-1} \\
&= \left( \sum_{m=0}^{\infty} (\lambda - \lambda_0)^m \left( (A - \lambda_0 I)^{-1} \right)^m \right) (A - \lambda_0 I)^{-1}.
\end{aligned} \tag{7.8}
\]

Thus, near any point $\lambda_0$ in the resolvent set of $A$, the resolvent $(A - \lambda I)^{-1}$
can be computed by the locally convergent series (7.8) in powers of $\lambda - \lambda_0$,
with the coefficients of the series being elements of $\mathcal{B}(\mathbf{H})$. For any $\phi, \psi \in \mathbf{H}$,
the map

\[
\lambda \mapsto \left\langle \phi, (A - \lambda I)^{-1} \psi \right\rangle \tag{7.9}
\]

will be given by a locally convergent power series with coefficients in $\mathbb{C}$,
meaning that the function (7.9) is a holomorphic function on the resolvent
set of $A$. Furthermore, from (7.6) we can see that $\|(A - \lambda I)^{-1}\|$ tends to
zero as $|\lambda|$ tends to infinity, and so also does the right-hand side of (7.9).

If $\sigma(A)$ were the empty set, the function (7.9) would be holomorphic
on all of $\mathbb{C}$ and tending to zero at infinity. By Liouville’s theorem, the
right-hand side of (7.9) would have to be identically zero for all $\phi$ and
$\psi$, which would mean that $(A - \lambda I)^{-1}$ is the zero operator. But since
$(A - \lambda I)(A - \lambda I)^{-1} = I$, the operator $(A - \lambda I)^{-1}$ cannot be zero. ■

If $A\psi = \lambda\psi$ for some $\lambda \in \mathbb{C}$ and some nonzero $\psi \in \mathbf{H}$, then $(A - \lambda I)$ has
a nonzero kernel and so $\lambda$ is in the spectrum of $A$. Thus, any eigenvalue
for $A$ is contained in the spectrum of $A$. In the infinite-dimensional case,
however, the converse is not true: A point in the spectrum may not be an
eigenvalue for $A$. Nevertheless, for a bounded self-adjoint operator $A$, the
spectrum of $A$ may be described in a way that is not too far removed from
what we have in the finite-dimensional case.

Proposition 7.7 If $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint, then the following results
hold.

1. The spectrum of $A$ is contained in the real line.

2. A number $\lambda \in \mathbb{R}$ belongs to the spectrum of $A$ if and only if there
exists a sequence $\psi_n$ of nonzero vectors in $\mathbf{H}$ such that

\[
\lim_{n \to \infty} \frac{\|A\psi_n - \lambda\psi_n\|}{\|\psi_n\|} = 0. \tag{7.10}
\]

Condition 2 in the proposition says that $\lambda \in \mathbb{R}$ belongs to the spectrum
if and only if $\lambda$ is “almost an eigenvalue,” meaning that there exists $\psi \ne 0$
for which $A\psi$ is equal to $\lambda\psi$ plus an error that is small compared to the
size of $\psi$.

Lemma 7.8 If $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint, then for all $\lambda = a + ib \in \mathbb{C}$, we
have

\[
\langle (A - \lambda I)\psi, (A - \lambda I)\psi \rangle \ge b^2 \langle\psi, \psi\rangle. \tag{7.11}
\]

Proof. We compute that

\[
\begin{aligned}
\langle (A - (a + ib)I)\psi, (A - (a + ib)I)\psi \rangle
&= \langle (A - aI)\psi, (A - aI)\psi \rangle + ib \langle\psi, (A - aI)\psi\rangle \\
&\quad - ib \langle (A - aI)\psi, \psi \rangle + b^2 \langle\psi, \psi\rangle.
\end{aligned} \tag{7.12}
\]

Since $A$ is self-adjoint, so is $A - aI$, from which we see that the second and
third terms on the right-hand side of (7.12) cancel, leaving us with

\[
\langle (A - \lambda I)\psi, (A - \lambda I)\psi \rangle = \langle (A - aI)\psi, (A - aI)\psi \rangle + b^2 \langle\psi, \psi\rangle,
\]

from which the desired inequality follows. ■

Proof of Proposition 7.7. For Point 1, we need to show that any complex
number $\lambda = a + ib$ with $b \ne 0$ belongs to the resolvent set of $A$. Since
$b \ne 0$, (7.11) shows that $A - \lambda I$ is injective. Meanwhile, by Proposition 7.3,
$\mathrm{Range}(A - \lambda I)^\perp = \ker(A - \bar{\lambda} I)$. Since $\bar{\lambda}$ also has nonzero imaginary part,
$A - \bar{\lambda} I$ is injective, and so the range of $A - \lambda I$ is dense in $\mathbf{H}$. To show
that the range is all of $\mathbf{H}$, consider any $\phi \in \mathbf{H}$ and choose a sequence
$\phi_n = (A - \lambda I)\psi_n$ in $\mathrm{Range}(A - \lambda I)$ with $\phi_n \to \phi$. Applying (7.11) with $\psi$
replaced by $\psi_n - \psi_m$ shows that $(\psi_n)$ is a Cauchy sequence. Thus, $\psi_n \to \psi$
for some $\psi \in \mathbf{H}$. Since $A$ is bounded,

\[
(A - \lambda I)\psi = \lim_{n \to \infty} (A - \lambda I)\psi_n = \lim_{n \to \infty} \phi_n = \phi.
\]

We conclude, then, that $A - \lambda I$ is one-to-one and onto. The inverse operator
$(A - \lambda I)^{-1}$ is bounded, by (7.11) (or by the closed graph theorem).

For Point 2, assume there exists a sequence as in (7.10), and suppose that
$A - \lambda I$ had an inverse. Letting $\phi_n = (A - \lambda I)\psi_n$, we have $\psi_n = (A - \lambda I)^{-1}\phi_n$
and so (7.10) says that

\[
\lim_{n \to \infty} \frac{\|\phi_n\|}{\|(A - \lambda I)^{-1}\phi_n\|} = 0,
\]

which shows that $(A - \lambda I)^{-1}$ is actually unbounded. Thus, $A - \lambda I$ cannot
have a bounded inverse.

Conversely, if, for some $\lambda \in \mathbb{R}$, no such sequence exists, then there exists
some $\varepsilon > 0$ such that

\[
\|(A - \lambda I)\psi\| \ge \varepsilon \|\psi\| \tag{7.13}
\]

for all $\psi \in \mathbf{H}$. Then $A - \lambda I$ is injective and Proposition 7.3 tells us that
the range of the self-adjoint operator $A - \lambda I$ is dense in $\mathbf{H}$. Arguing as in
the preceding paragraphs with (7.13) in place of (7.11), we can see that the
range of $A - \lambda I$ is also closed, hence all of $\mathbf{H}$. This shows that $A - \lambda I$ has
a bounded inverse. ■

Example 7.9 Let $\mathbf{H} = L^2([0,1])$ and let $A$ be the operator on $\mathbf{H}$ defined
by

\[
(A\psi)(x) = x\psi(x).
\]

Then this operator is bounded and self-adjoint, and its spectrum is given by

\[
\sigma(A) = [0,1].
\]

As we have already noted in Sect. 6.1, the operator $A$ does not have any
(true) eigenvectors.

Proof. It is apparent that $\|A\psi\| \le \|\psi\|$ and that $\langle\phi, A\psi\rangle = \langle A\phi, \psi\rangle$ for all
$\phi, \psi \in \mathbf{H}$, so that $A$ is bounded and self-adjoint. Given $\lambda \in (0,1)$, consider
the functions $\psi_n := 1_{[\lambda, \lambda + 1/n]}$, which, for all $n$ large enough that
$\lambda + 1/n \le 1$, satisfy $\|\psi_n\|^2 = 1/n$. On the other hand, since
$|x - \lambda| \le 1/n$ on $[\lambda, \lambda + 1/n]$, we have

\[
\|(A - \lambda I)\psi_n\|^2 \le 1/n^3.
\]

Thus, by Proposition 7.7, $\lambda$ belongs to the spectrum of $A$. Since this holds
for all $\lambda \in (0,1)$ and the spectrum of $A$ is closed, $\sigma(A) \supset [0,1]$.

Meanwhile, if $\lambda \notin [0,1]$, then the function $1/(x - \lambda)$ is bounded on
$[0,1]$, and so $A - \lambda I$ has a bounded inverse, consisting of multiplication by
$1/(x - \lambda)$. Thus, $\sigma(A) = [0,1]$. ■
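
The “almost eigenvectors” used in the proof can be made quantitative. The following Python sketch evaluates the ratio in (7.10) exactly for $\psi_n = 1_{[\lambda, \lambda + 1/n]}$, using the elementary integral $\int_0^{1/n} t^2 \, dt = 1/(3n^3)$. The function name `almost_eigen_ratio` is a hypothetical helper, and the computation assumes $n$ is large enough that $\lambda + 1/n \le 1$.

```python
import math
from fractions import Fraction


def almost_eigen_ratio(n: int) -> float:
    """Return ||(A - lam I) psi_n|| / ||psi_n|| for psi_n = 1_[lam, lam + 1/n].

    Both norms are independent of lam, provided lam + 1/n <= 1:
      ||psi_n||^2             = 1/n
      ||(A - lam I) psi_n||^2 = integral of t^2 over [0, 1/n] = 1/(3 n^3)
    """
    norm_sq = Fraction(1, n)
    residual_sq = Fraction(1, 3 * n ** 3)
    return math.sqrt(residual_sq / norm_sq)


# The ratio equals 1/(n * sqrt(3)), which tends to zero as n grows.
ratios = [almost_eigen_ratio(n) for n in (10, 100, 1000)]
```

The exact ratio $1/(n\sqrt{3})$ is slightly smaller than the bound $1/n$ obtained in the proof, and it tends to zero even though no true eigenvector exists.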

7.2 Spectral Theorem for Bounded Self-Adjoint 
Operators, I 

7.2.1 Spectral Subspaces 

Given a bounded (for now) self-adjoint operator $A$, we hope to associate
with each Borel set $E \subset \sigma(A)$ a closed subspace $V_E$ of $\mathbf{H}$, where we think
intuitively that $V_E$ is the closed span of the generalized eigenvectors for $A$
with eigenvalues in $E$. [We could do this more generally for any $E \subset \mathbb{R}$,
but we do not expect any contribution from $\mathbb{R} \setminus \sigma(A)$.] We would expect the
collection of these subspaces to have the following properties.

1. $V_{\sigma(A)} = \mathbf{H}$ and $V_\emptyset = \{0\}$.

2. If $E$ and $F$ are disjoint, then $V_E \perp V_F$.

3. For any $E$ and $F$, $V_{E \cap F} = V_E \cap V_F$.

4. If $E_1, E_2, \ldots$ are disjoint and $E = \bigcup_j E_j$, then

\[
V_E = \bigoplus_j V_{E_j}.
\]

5. For any $E$, $V_E$ is invariant under $A$.

6. If $E \subset [\lambda_0 - \varepsilon, \lambda_0 + \varepsilon]$ and $\psi \in V_E$, then

\[
\|(A - \lambda_0 I)\psi\| \le \varepsilon \|\psi\|.
\]

The condition $V_{\sigma(A)} = \mathbf{H}$ captures the idea that our generalized
eigenvectors should span $\mathbf{H}$, while Property 2 captures the idea that our generalized
eigenvectors should have some sort of orthogonality for distinct
eigenvalues, even if they are not actually in the Hilbert space. In Property 4, there
may be infinitely many of the $E_j$’s, in which case, the direct sum is in the
Hilbert space sense (Definition A.45). Properties 5 and 6 capture the idea
that $V_E$ is made up of generalized eigenvectors for $A$ with eigenvalues in $E$.

7.2.2 Projection-Valued Measures

It is convenient to describe closed subspaces of a Hilbert space $\mathbf{H}$ in terms of
the associated orthogonal projection operators. Recall (Proposition A.57)
that, given a closed subspace $V$ of $\mathbf{H}$, there exists a unique bounded
operator $P$ that equals the identity on $V$ and equals zero on the orthogonal
complement $V^\perp$ of $V$. This operator is called the orthogonal projection
onto $V$ and satisfies $P^2 = P$ and $P^* = P$. The following definition
expresses the first four properties of our spectral subspaces (the ones that
do not involve the operator $A$) in terms of the corresponding orthogonal
projections. Since those properties are similar to those of a measure, we
use the term projection-valued measure.

Definition 7.10 Let $X$ be a set and $\Omega$ a $\sigma$-algebra in $X$. A map
$\mu : \Omega \to \mathcal{B}(\mathbf{H})$ is called a projection-valued measure if the following
properties are satisfied.

1. For each $E \in \Omega$, $\mu(E)$ is an orthogonal projection.

2. $\mu(\emptyset) = 0$ and $\mu(X) = I$.

3. If $E_1, E_2, E_3, \ldots$ in $\Omega$ are disjoint, then for all $v \in \mathbf{H}$, we have

\[
\mu\!\left( \bigcup_{j=1}^{\infty} E_j \right) v = \sum_{j=1}^{\infty} \mu(E_j) v,
\]

where the convergence of the sum is in the norm topology on $\mathbf{H}$.

4. For all $E_1, E_2 \in \Omega$, we have $\mu(E_1 \cap E_2) = \mu(E_1)\mu(E_2)$.

Note that if $E_1$ and $E_2$ are disjoint, then Properties 2 and 4 tell us
that $\mu(E_1)\mu(E_2) = 0$, from which it follows (Exercise 10) that the range
of $\mu(E_1)$ and the range of $\mu(E_2)$ are perpendicular. It is then not hard to
verify that $\mu(E_1)\mu(E_2)$ is the projection onto the intersection of the ranges
of $\mu(E_1)$ and $\mu(E_2)$ (Exercise 11). Thus, if we define, for each $E \in \Omega$, a
closed subspace $V_E := \mathrm{Range}(\mu(E))$, then the collection of $V_E$’s satisfy the
first four properties that we anticipated for spectral subspaces.
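
The simplest instance of Definition 7.10 arises from a diagonal matrix. In the Python sketch below, which is an illustration under stated assumptions rather than part of the development, $X$ is the finite set of eigenvalues of a hypothetical diagonal matrix, $\Omega$ consists of all subsets of $X$, and $\mu(E)$ is the diagonal projection that keeps the coordinates whose eigenvalue lies in $E$. The names `mu` and `compose` are introduced only for this example, and diagonal matrices are stored as lists of their diagonal entries.

```python
def mu(E, eigenvalues):
    """Diagonal of the projection mu(E): entry j is 1 if lambda_j lies in E."""
    return [1 if lam in E else 0 for lam in eigenvalues]


def compose(p, q):
    """Product of two diagonal matrices, each stored as its list of diagonal entries."""
    return [a * b for a, b in zip(p, q)]


eigenvalues = [-1, 0, 0, 2, 5]   # hypothetical spectrum; the eigenvalue 0 is repeated
X = set(eigenvalues)
E1, E2 = {-1, 0, 2}, {0, 2, 5}

# Property 4: mu(E1 ∩ E2) = mu(E1) mu(E2).
lhs = mu(E1 & E2, eigenvalues)
rhs = compose(mu(E1, eigenvalues), mu(E2, eigenvalues))
```

In this model the range of `mu(E)` is the span of the coordinate vectors with eigenvalue in $E$, so the subspaces $V_E$ are exactly the spans of eigenvectors with eigenvalues in $E$.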

In the next subsection, we will associate a projection-valued measure $\mu^A$
with each bounded self-adjoint operator $A$. In that case, the projection
$\mu^A(E)$ will be thought of as a projection onto the spectral subspace
corresponding to $E$. We are about to introduce the notion of operator-valued
integration with respect to a projection-valued measure. In the case of the
projection-valued measure $\mu^A$ associated with $A$, this operator-valued
integral will be the functional calculus for $A$.

Observe that, for any projection-valued measure $\mu$ and $\psi \in \mathbf{H}$, we can
form an ordinary (positive) real-valued measure $\mu_\psi$ by setting

\[
\mu_\psi(E) = \langle\psi, \mu(E)\psi\rangle \tag{7.14}
\]

for all $E \in \Omega$. This observation provides a link between integration with
respect to a projection-valued measure and integration with respect to an
ordinary measure.
Proposition 7.11 (Operator-Valued Integration) Let LI be a a-alge- 
bra in a set X and let p : LI —>■ B( H) be a projection-valued measure. Then 
there exists a unique linear map, denoted f H > f n f dp, from the space of 
bounded, measurable, complex-valued functions on LI into B(H) with the 
property that 

(i>, i^J f dp'j ip\ = j f dp^ (7.15) 

for all f and all feH, where p^ is given by (7. If). This integral has the 
following additional properties. 

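1. For all $E \in \Omega$, we have

\[ \int_X 1_E \, d\mu = \mu(E). \]

In particular, the integral of the constant function 1 is $I$.

2. For all $f$, we have

\[ \left\| \int_X f \, d\mu \right\| \le \sup_{\lambda \in X} |f(\lambda)|. \tag{7.16} \]

3. Integration is multiplicative: For all $f$ and $g$, we have

\[ \int_X f g \, d\mu = \left( \int_X f \, d\mu \right) \left( \int_X g \, d\mu \right). \tag{7.17} \]

4. For all $f$, we have

\[ \int_X \bar{f} \, d\mu = \left( \int_X f \, d\mu \right)^*. \]

In particular, if $f$ is real-valued, then $\int_X f \, d\mu$ is self-adjoint.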

By Property 1 and linearity, integration with respect to $\mu$ has the
expected behavior on simple functions. It then follows from Property 2 that
the integral of an arbitrary bounded measurable function $f$ can be computed
as follows. Take a sequence $s_n$ of simple functions converging uniformly to
$f$; the integral of $f$ is then the limit, in the operator norm topology, of the
integrals of the $s_n$'s.

Although the multiplicative property of the integral may seem surprising
at first, observe that for any $E_1, E_2 \in \Omega$, Property 4 in Definition 7.10 tells
us that

\[ \left( \int_X 1_{E_1} \, d\mu \right) \left( \int_X 1_{E_2} \, d\mu \right) = \mu(E_1)\mu(E_2) = \mu(E_1 \cap E_2) = \int_X 1_{E_1} \cdot 1_{E_2} \, d\mu. \]

Thus, multiplicativity of the integral at the level of indicator functions is
built into the definition of a projection-valued measure.

If one wanted to make a real-valued measure for which the corresponding
integral was multiplicative, then since $1_E \cdot 1_E = 1_E$, the integral of $1_E$,
namely $\mu(E)$, would have to satisfy $\mu(E)^2 = \mu(E)$. This would mean
that $\mu(E)$ is 0 or 1 for all $E$. For such measures, one would indeed obtain
multiplicativity of the integral, but measures with this property are not
very interesting. For operator-valued measures, we can have interesting
examples where the integral is multiplicative, simply because there are
many more idempotents (elements $A$ with $A^2 = A$) in $\mathcal{B}(\mathbf{H})$ than in $\mathbb{R}$.
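
The following is a minimal numerical sketch of these ideas, under the added assumption (not made in the text) that $\mathbf{H} = \mathbb{R}^2$ and $X = \{1, 3\}$, with the projection-valued measure built from the two spectral projections of the symmetric matrix with rows (2, 1) and (1, 2). All function names (`mu`, `integrate`, `inner`, and so on) are hypothetical helpers introduced only for this illustration.

```python
# Finite-dimensional sketch of a projection-valued measure on X = {1.0, 3.0}.
# Assumption: H = R^2, and P[lam] is the orthogonal projection onto the
# eigenspace of A = [[2, 1], [1, 2]] for the eigenvalue lam.

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    """Scalar multiple c * a of a 2x2 matrix."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    """True if two 2x2 matrices agree entrywise up to tol."""
    return all(abs(a[i][j] - b[i][j]) <= tol
               for i in range(2) for j in range(2))

ZERO = [[0.0, 0.0], [0.0, 0.0]]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

P = {
    1.0: [[0.5, -0.5], [-0.5, 0.5]],  # projection onto span{(1, -1)}
    3.0: [[0.5, 0.5], [0.5, 0.5]],    # projection onto span{(1, 1)}
}

def mu(E):
    """mu(E): sum of the projections P[lam] over the points lam of X in E."""
    result = ZERO
    for lam in P:
        if lam in E:
            result = add(result, P[lam])
    return result

def integrate(f):
    """Operator-valued integral of f against mu: sum of f(lam) * P[lam]."""
    result = ZERO
    for lam in P:
        result = add(result, scale(f(lam), P[lam]))
    return result

def inner(u, a, v):
    """<u, a v> for real vectors u, v and a real 2x2 matrix a."""
    av = [sum(a[i][j] * v[j] for j in range(2)) for i in range(2)]
    return sum(u[i] * av[i] for i in range(2))

# Integrating the function f(lam) = lam recovers the matrix A itself.
A_rebuilt = integrate(lambda lam: lam)
```

In this toy setting, `matmul(mu(E1), mu(E2))` agrees with `mu(E1 & E2)` for subsets `E1`, `E2` of `{1.0, 3.0}`, and `integrate` is multiplicative because the two projections are idempotent and annihilate each other.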

Proof of Proposition 7.11. Given a projection-valued measure $\mu$ and a
bounded measurable function $f$ on $X$, define a map $Q_f : \mathbf{H} \to \mathbb{C}$ by

\[ Q_f(\psi) = \int_X f \, d\mu_\psi, \]

where $\mu_\psi$ is given by (7.14). If $f$ is an indicator function $1_E$, then
$Q_f(\psi) = \langle \psi, \mu(E)\psi \rangle$ is a bounded quadratic form. (See Definition A.60.) It is
straightforward to show, passing from indicator functions to simple functions and
then to general functions, that for any bounded measurable $f$, $Q_f$ is a
bounded quadratic form, with

\[ |Q_f(\psi)| \le \left( \sup_{\lambda \in X} |f(\lambda)| \right) \|\psi\|^2. \tag{7.18} \]

It then follows from Proposition A.63 that there is a unique bounded
operator $A_f$ such that

\[ Q_f(\psi) = \langle \psi, A_f \psi \rangle \]

for all $\psi \in \mathbf{H}$. We set $\int_X f \, d\mu = A_f$. From the way $A_f$ is defined, it
satisfies (7.15). The uniqueness of the linear map $f \mapsto \int_X f \, d\mu$ follows
from the uniqueness in Proposition A.63.

If $f = 1_E$, then $Q_f(\psi) = \mu_\psi(E) = \langle \psi, \mu(E)\psi \rangle$, in which case the unique
associated operator $A_f$ is $\mu(E)$. This establishes Property 1. Property 2
follows from (7.18).

For Property 3, we have already observed that multiplicativity of the
integral, at the level of indicator functions, is built into the definition of a
projection-valued measure. Since both sides of (7.17) are bilinear in $(f, g)$,
we have (7.17) for simple functions. Using Property 2, we can then
obtain (7.17) for all bounded measurable functions by taking limits.

Finally, if $f$ is real valued, then $Q_f(\psi)$ will be real for all $\psi \in \mathbf{H}$. Thus, by
Proposition A.63, the associated operator $A_f$ will be self-adjoint. Property
4 then follows by linearity. ■

7.2.3 The Spectral Theorem 

We are ready to state one version of the spectral theorem for bounded
self-adjoint operators.

Theorem 7.12 (Spectral Theorem, First Form) If $A \in \mathcal{B}(\mathbf{H})$ is
self-adjoint, then there exists a unique projection-valued measure $\mu^A$ on the
Borel $\sigma$-algebra in $\sigma(A)$, with values in projections on $\mathbf{H}$, such that

\[ \int_{\sigma(A)} \lambda \, d\mu^A(\lambda) = A. \tag{7.19} \]

Since the spectrum $\sigma(A)$ of $A$ is bounded, the function $f(\lambda) := \lambda$ is
bounded on $\sigma(A)$. The proof of this theorem is given in Chap. 8.

Definition 7.13 (Functional Calculus) If $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and
$f : \sigma(A) \to \mathbb{C}$ is a bounded measurable function, define an operator $f(A)$
by setting

\[ f(A) = \int_{\sigma(A)} f(\lambda) \, d\mu^A(\lambda), \]

where $\mu^A$ is the projection-valued measure in Theorem 7.12.

We may extend the projection-valued measure $\mu^A$ from $\sigma(A)$ to all of
$\mathbb{R}$ by assigning measure 0 to $\mathbb{R} \setminus \sigma(A)$. Then, roughly speaking, $f(A)$ is
the operator that is equal to $f(\lambda)I$ on the range of the projection operator
$\mu^A([\lambda, \lambda + d\lambda))$.

Since the integral with respect to $\mu^A$ is multiplicative, it follows from
(7.19) that if $f(\lambda) = \lambda^m$ for some positive integer $m$, then $f(A)$ is the
$m$th power of $A$. Further, since the series $e^{a\lambda} = \sum_{m=0}^{\infty} (a\lambda)^m / m!$ converges
uniformly on the compact set $\sigma(A)$, the operator $e^{aA}$ (computed using the
functional calculus for the function $f(\lambda) = e^{a\lambda}$) may be computed as a
power series.
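
As a sanity check on Definition 7.13, consider the special case (an assumption made here purely for illustration) in which $\mathbf{H}$ is finite dimensional, so that $\sigma(A) = \{\lambda_1, \dots, \lambda_n\}$ consists of the distinct eigenvalues of $A$. Writing $P_k$ for the orthogonal projection onto the eigenspace for $\lambda_k$ (notation introduced here, not in the text), the integral against $\mu^A$ should reduce to a finite sum:

```latex
\begin{align*}
\mu^A(E) &= \sum_{\lambda_k \in E} P_k, \\
f(A) &= \int_{\sigma(A)} f(\lambda)\, d\mu^A(\lambda) = \sum_{k=1}^{n} f(\lambda_k)\, P_k, \\
A^m &= \sum_{k=1}^{n} \lambda_k^m\, P_k, \qquad
e^{aA} = \sum_{k=1}^{n} e^{a\lambda_k}\, P_k = \sum_{m=0}^{\infty} \frac{a^m A^m}{m!}.
\end{align*}
```

The last equality is the finite-dimensional instance of the power series statement: applying the scalar series for $e^{a\lambda}$ at each $\lambda_k$ and summing against the $P_k$ gives the operator series.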

Definition 7.14 (Spectral Subspaces) For $A \in \mathcal{B}(\mathbf{H})$, let $\mu^A$ be the
associated projection-valued measure, extended to be a measure on $\mathbb{R}$ by
setting $\mu^A(\mathbb{R} \setminus \sigma(A)) = 0$. Then for each Borel set $E \subset \mathbb{R}$, define the
spectral subspace $V_E$ of $\mathbf{H}$ by

\[ V_E = \operatorname{Range}(\mu^A(E)). \]

The definition of a projection-valued measure implies that these spectral
subspaces satisfy the first four properties listed in Sect. 7.2.1. We now show
that (7.19) implies the remaining two properties we anticipated for the
spectral subspaces.

Proposition 7.15 If $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint, the spectral subspaces
associated with $A$ have the following properties.

1. Each spectral subspace $V_E$ is invariant under $A$.

2. If $E \subset [\lambda_0 - \varepsilon, \lambda_0 + \varepsilon]$ then for all $\psi \in V_E$, we have

\[ \|(A - \lambda_0 I)\psi\| \le \varepsilon \|\psi\|. \]

3. The spectrum of $A|_{V_E}$ is contained in the closure of $E$.

4. If $\lambda_0$ is in the spectrum of $A$, then for every neighborhood $U$ of $\lambda_0$,
we have $V_U \neq \{0\}$; or, equivalently, $\mu^A(U) \neq 0$.

Proof. For Point 1, observe that for any bounded measurable functions $f$
and $g$ on $\sigma(A)$, the operators $f(A)$ and $g(A)$ commute, since the product
in either order is equal to the integral of the function $fg = gf$ with respect
to $\mu^A$. In particular, $A$, which is the integral of the function $f(\lambda) = \lambda$,
commutes with $\mu^A(E)$, which is the integral of the function $1_E$. Thus,
given a vector $\mu^A(E)\phi$ in the range of $\mu^A(E)$, we have

\[ A \mu^A(E) \phi = \mu^A(E) A \phi, \]

which is again in the range of $\mu^A(E)$, establishing the invariance of the
spectral subspace.

For Point 2, suppose that $\psi \in V_E$, where $E \subset [\lambda_0 - \varepsilon, \lambda_0 + \varepsilon]$. Then $\psi$ is
in the range of $\mu^A(E)$, and so

\[ (A - \lambda_0 I)\psi = (A - \lambda_0 I)\mu^A(E)\psi. \]

But $\mu^A(E) = 1_E(A)$ and $A - \lambda_0 I = f(A)$, where $f(\lambda) = \lambda - \lambda_0$. By the
multiplicativity of the integral, then,

\[ (A - \lambda_0 I)\psi = (f 1_E)(A)\psi. \]

But $|f(\lambda) 1_E(\lambda)| \le \varepsilon$ and so by (7.16), the operator $(f 1_E)(A)$ has norm at
most $\varepsilon$.
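
The same bound can also be read off from the scalar measure $\mu^A_\psi(F) = \langle \psi, \mu^A(F)\psi \rangle$ of (7.14). The computation below is a sketch of that alternative route, using only (7.15) and Properties 3 and 4 of Proposition 7.11, with $f(\lambda) = \lambda - \lambda_0$ and $\psi \in V_E$ as above:

```latex
\begin{align*}
\|(A - \lambda_0 I)\psi\|^2
  &= \big\langle (f 1_E)(A)\psi,\ (f 1_E)(A)\psi \big\rangle
   = \big\langle \psi,\ (|f|^2 1_E)(A)\psi \big\rangle \\
  &= \int_E |\lambda - \lambda_0|^2 \, d\mu^A_\psi(\lambda)
   \le \varepsilon^2\, \mu^A_\psi(E)
   = \varepsilon^2 \langle \psi, \mu^A(E)\psi \rangle
   = \varepsilon^2 \|\psi\|^2 .
\end{align*}
```

The final equality uses $\mu^A(E)\psi = \psi$, which holds because $\psi$ lies in the range of the projection $\mu^A(E)$.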

For Point 3, if $\lambda_0$ is not in $\bar{E}$, then the function $g(\lambda) := 1_E(\lambda)(1/(\lambda - \lambda_0))$
is bounded. Thus, $g(A)$ is a bounded operator and

\[ g(A)(A - \lambda_0 I) = (A - \lambda_0 I) g(A) = 1_E(A). \]

This shows that the restriction to $V_E$ of $g(A)$ is the inverse of the restriction
to $V_E$ of $A - \lambda_0 I$. Thus, $\lambda_0$ is not in the spectrum of $A|_{V_E}$.

For Point 4, fix $\lambda_0 \in \sigma(A)$ and suppose for some $\varepsilon > 0$, we have
$\mu^A((\lambda_0 - \varepsilon, \lambda_0 + \varepsilon)) = 0$. Consider, then, the bounded function $f$ defined by

\[ f(\lambda) = \begin{cases} \dfrac{1}{\lambda - \lambda_0}, & |\lambda - \lambda_0| \ge \varepsilon \\[1ex] 0, & |\lambda - \lambda_0| < \varepsilon. \end{cases} \]

Since $f(\lambda) \cdot (\lambda - \lambda_0)$ equals 1 except on $(\lambda_0 - \varepsilon, \lambda_0 + \varepsilon)$, the equation
$f(\lambda) \cdot (\lambda - \lambda_0) = 1$ holds $\mu^A$-almost everywhere. Thus, the integral of this
function coincides with the integral of the constant function 1, which is $I$.
Since the integral is multiplicative, we see that

\[ f(A)(A - \lambda_0 I) = (A - \lambda_0 I) f(A) = I, \]

showing that the bounded operator $f(A)$ is the inverse of $(A - \lambda_0 I)$. This
contradicts the assumption that $\lambda_0 \in \sigma(A)$. ■

Proposition 7.16 If $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and $B \in \mathcal{B}(\mathbf{H})$ commutes
with $A$, the following results hold.

1. For all bounded measurable functions $f$ on $\sigma(A)$, the operator $f(A)$
commutes with $B$.

2. Each spectral subspace for $A$ is invariant under $B$.

The proof of this proposition is deferred until Chap. 8. We conclude this
section by fulfilling (at least for bounded self-adjoint operators) one of
the goals of the spectral theorem, namely to give a probability measure
describing the probabilities for measurements of a self-adjoint operator $A$
in the state $\psi$.

Proposition 7.17 Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and $\psi \in \mathbf{H}$ is a unit
vector. Then there exists a unique probability measure $\mu^A_\psi$ on $\mathbb{R}$ such that

\[ \int_{\mathbb{R}} \lambda^m \, d\mu^A_\psi(\lambda) = \langle \psi, A^m \psi \rangle \]

for all non-negative integers $m$.

We will prove a version of Proposition 7.17 for unbounded self-adjoint
operators in Chap. 9. In the unbounded case, however, we will not obtain
uniqueness of the probability measure, even if $\psi$ is in the domain of $A^m$ for
all $m$. Even in the unbounded case, however, the spectral theorem provides
a canonical choice of the probability measure.

Proof. We define a measure $\mu^A_\psi$ on $\sigma(A)$ as in Sect. 7.2.2 by

\[ \mu^A_\psi(E) = \langle \psi, \mu^A(E)\psi \rangle. \]

The properties of integration with respect to $\mu^A$ then tell us that

\[ \int_{\sigma(A)} \lambda^m \, d\mu^A_\psi(\lambda) = \left\langle \psi, \left( \int_{\sigma(A)} \lambda^m \, d\mu^A(\lambda) \right) \psi \right\rangle = \langle \psi, A^m \psi \rangle. \]

We then extend $\mu^A_\psi$ to $\mathbb{R}$ by setting it equal to zero on $\mathbb{R} \setminus \sigma(A)$, establishing
the existence of the desired probability measure on $\mathbb{R}$. Since

\[ |\langle \psi, A^m \psi \rangle| \le \|\psi\|^2 \|A^m\| \le \|\psi\|^2 \|A\|^m, \]

the moments grow only exponentially with $m$. Thus, standard uniqueness
results for the moment problem (e.g., Theorem 8.1 in Chap. 4 of [18]) give
the uniqueness of $\mu^A_\psi$. ■
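
The moment identity in Proposition 7.17 can be checked numerically in a toy case. The sketch below assumes (beyond anything in the text) that $\mathbf{H} = \mathbb{R}^2$ and that $A$ is the symmetric matrix with rows (2, 1) and (1, 2), whose eigenvalues are 1 and 3; the measure $\mu^A_\psi$ is then a pair of point masses. The names `spectral_weights`, `moment_from_measure`, and `moment_from_operator` are hypothetical helpers, not part of any library.

```python
import math

# Assumed example: A = [[2, 1], [1, 2]] with unit eigenvectors
# (1, -1)/sqrt(2) for eigenvalue 1 and (1, 1)/sqrt(2) for eigenvalue 3.
A = [[2.0, 1.0], [1.0, 2.0]]
R = 1.0 / math.sqrt(2.0)
EIGEN = {1.0: (R, -R), 3.0: (R, R)}

def spectral_weights(psi):
    """Point masses mu_psi({lam}) = <v_lam, psi>^2 for each eigenvalue lam."""
    return {lam: (v[0] * psi[0] + v[1] * psi[1]) ** 2
            for lam, v in EIGEN.items()}

def moment_from_measure(psi, m):
    """Integral of lam^m against mu_psi, which here is a finite sum."""
    return sum(lam ** m * w for lam, w in spectral_weights(psi).items())

def moment_from_operator(psi, m):
    """<psi, A^m psi>, computed by applying A to psi m times."""
    v = list(psi)
    for _ in range(m):
        v = [A[0][0] * v[0] + A[0][1] * v[1],
             A[1][0] * v[0] + A[1][1] * v[1]]
    return psi[0] * v[0] + psi[1] * v[1]

# A unit vector; its spectral weights sum to 1 up to rounding.
psi = (0.6, 0.8)
weights = spectral_weights(psi)
```

For a unit vector `psi`, the values in `weights` are non-negative and sum to 1 up to rounding, so they define a probability measure on the spectrum, and the two moment functions agree for each non-negative integer `m`.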

7.3 Spectral Theorem for Bounded Self-Adjoint 
Operators, II 

As we have already noted in Sect. 6.5, one version of the spectral theorem
asserts that every self-adjoint operator is unitarily equivalent to a
multiplication operator. In the case of a bounded self-adjoint operator $A$ on a
separable Hilbert space $\mathbf{H}$, this result means that $A$ is unitarily
equivalent to the operator $M_h$ on $L^2(X, \mu)$, where $(X, \mu)$ is a $\sigma$-finite measure
space, $h$ is a measurable, real-valued function, and $M_h$ is the operator of
multiplication by $h$:

\[ (M_h \psi)(\lambda) = h(\lambda)\psi(\lambda). \]

Although the “multiplication operator” form of the spectral theorem
(Theorem 7.20) has the advantage of being easy to state, there is an even
better version involving the concept of a direct integral. It is straightforward
to extend the notion of an $L^2$ space to an $L^2$ space with values in a Hilbert
space $\mathbf{H}$. In a direct integral, we extend the concept one step further, by
allowing the Hilbert space to depend on the point. We begin with a measure
space $(X, \mu)$ and then have one Hilbert space $\mathbf{H}_\lambda$ for each $\lambda$ in $X$. An
element of the direct integral is a function $s$ on $X$ such that $s(\lambda)$ belongs
to $\mathbf{H}_\lambda$ for each $\lambda \in X$. Given a real-valued, measurable function $h$ on $X$, it
makes sense to multiply an element $s$ of the direct integral by $h$.

The direct integral form of the spectral theorem says a bounded
self-adjoint operator $A$ is unitarily equivalent to a multiplication operator on a
direct integral. By extending multiplication operators to the more general
setting of direct integrals (instead of just ordinary $L^2$ spaces), we gain
several benefits. First, the set $X$ and the function $h$ become canonical: The
set $X$ is simply the spectrum of $A$ and the function $h$ is simply $h(\lambda) = \lambda$.
Second, the direct integral approach carries with it a notion of “generalized
eigenvectors,” since the space $\mathbf{H}_\lambda$ can be thought of as the space of
generalized eigenvectors with eigenvalue $\lambda$. (The spaces $\mathbf{H}_\lambda$ are not, in general,
contained in the direct integral Hilbert space. Thus, direct integrals give a
rigorous meaning to the idea of “eigenvectors” that are not in the Hilbert
space on which the operator acts.) Third, the direct integral approach gives
a simple way to classify self-adjoint operators up to unitary equivalence:
Two self-adjoint operators are unitarily equivalent if and only if their direct
integral representations are equivalent in a natural sense (Proposition 7.24).

If one really wants the simplicity of the (ordinary) multiplication operator
version of the spectral theorem, it is a simple matter to prove this result
using precisely the same methods as in the proof of the direct integral
version. (See Theorem 7.20.) Nevertheless, the direct integral version is,
arguably, the most definitive version of the spectral theorem for a single
self-adjoint operator.

We turn now to the definition of a direct integral. Suppose $\mu$ is a $\sigma$-finite
measure on a $\sigma$-algebra of sets in $X$. Suppose also that for each $\lambda \in X$,
we have a separable Hilbert space $\mathbf{H}_\lambda$ with inner product $\langle \cdot, \cdot \rangle_\lambda$. We want
to define the direct integral of the $\mathbf{H}_\lambda$'s with respect to $\mu$. Elements of the
direct integral will be sections $s$, meaning that $s$ is a function on $X$ with
values in the union of the $\mathbf{H}_\lambda$'s, having the property that

\[ s(\lambda) \in \mathbf{H}_\lambda \]

for each $\lambda$ in $X$. We would like to define the norm of a section $s$ by the
formula

\[ \|s\|^2 = \int_X \langle s(\lambda), s(\lambda) \rangle_\lambda \, d\mu(\lambda), \]

provided that the integral on the right-hand side is finite. The inner product
of two sections $s_1$ and $s_2$ (with finite norm) should then be given by the
formula

\[ \langle s_1, s_2 \rangle = \int_X \langle s_1(\lambda), s_2(\lambda) \rangle_\lambda \, d\mu(\lambda). \]

The problem with this description of the norm and inner product on
the direct integral is that we have not said anything about measurability.
As things stand, it does not make sense to ask whether a section $s$ is
measurable, since the space in which $s(\lambda)$ takes its values is different for
each $\lambda$. We must, therefore, introduce some additional structure that gives
rise to a notion of measurability. (The measurability issue is a technicality
that can be ignored on a first reading.)

One way to address the measurability issue is to choose a simultaneous
orthonormal basis for each of the Hilbert spaces $\mathbf{H}_\lambda$. To deal with the
possibility that different spaces can have different dimensions, we slightly
modify the concept of an orthonormal basis. We say that a family $\{e_j\}$ of
vectors is an orthonormal basis for a Hilbert space $\mathbf{H}$ if $\langle e_j, e_k \rangle = 0$ for
$j \neq k$, the norm of each $e_j$ is either 0 or 1, and the closure of the span
of the $e_j$'s is all of $\mathbf{H}$. This just means that we allow some of the vectors
in our basis to be zero, with the nonzero vectors forming an orthonormal
basis in the usual sense.

We now define a simultaneous orthonormal basis for a family $\{\mathbf{H}_\lambda\}$ of
separable Hilbert spaces to be a collection $\{e_j(\cdot)\}_{j=1}^{\infty}$ of sections with the
property that for each $\lambda$, $\{e_j(\lambda)\}_{j=1}^{\infty}$ is an orthonormal basis for $\mathbf{H}_\lambda$.
Provided that the function $\lambda \mapsto \dim \mathbf{H}_\lambda$ is a measurable function from $X$ into
$[0, \infty]$, it is possible to choose a simultaneous orthonormal basis $\{e_j(\cdot)\}$
such that $\langle e_j(\lambda), e_k(\lambda) \rangle_\lambda$ is measurable for all $j$ and $k$. Having chosen a
simultaneous orthonormal basis with this property, we define a section $s$ to
be measurable if the function

\[ \lambda \mapsto \langle e_j(\lambda), s(\lambda) \rangle_\lambda \]

is a measurable complex-valued function for each $j$. Our assumption on the
$e_j$'s means that the $e_j$'s themselves are measurable sections.

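We refer to a choice of simultaneous orthonormal basis, chosen so that
$\langle e_j(\lambda), e_k(\lambda) \rangle_\lambda$ is measurable, as a measurability structure on the collection
of $\mathbf{H}_\lambda$'s. Given two measurable sections $s_1$ and $s_2$, the function

\[ \lambda \mapsto \langle s_1(\lambda), s_2(\lambda) \rangle_\lambda = \sum_{j=1}^{\infty} \langle s_1(\lambda), e_j(\lambda) \rangle_\lambda \langle e_j(\lambda), s_2(\lambda) \rangle_\lambda \]

is also measurable.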

Definition 7.18 Suppose the following structures are given: (1) a $\sigma$-finite
measure space $(X, \Omega, \mu)$, (2) a collection $\{\mathbf{H}_\lambda\}_{\lambda \in X}$ of separable Hilbert
spaces for which the dimension function is measurable, and (3) a
measurability structure on $\{\mathbf{H}_\lambda\}_{\lambda \in X}$. Then the direct integral of the $\mathbf{H}_\lambda$'s
with respect to $\mu$, denoted

\[ \int_X^{\oplus} \mathbf{H}_\lambda \, d\mu(\lambda), \]

is the space of equivalence classes of almost-everywhere-equal measurable
sections $s$ for which

\[ \|s\|^2 := \int_X \langle s(\lambda), s(\lambda) \rangle_\lambda \, d\mu(\lambda) < \infty. \]

The inner product $\langle s_1, s_2 \rangle$ of two such sections $s_1$ and $s_2$ is given by the
formula

\[ \langle s_1, s_2 \rangle := \int_X \langle s_1(\lambda), s_2(\lambda) \rangle_\lambda \, d\mu(\lambda). \]

To see that the integral defining the inner product of two finite-norm
sections is finite, note that $|\langle s_1(\lambda), s_2(\lambda) \rangle_\lambda| \le \|s_1(\lambda)\|_\lambda \|s_2(\lambda)\|_\lambda$. By
assumption, $\|s_j(\lambda)\|_\lambda$ is a square-integrable function of $\lambda$ for $j = 1, 2$, and
the product of two square-integrable functions is integrable. Thus, the
integrand in the definition of $\langle s_1, s_2 \rangle$ is also integrable. It is not hard to show,
using an argument similar to the proof of completeness of $L^2$ spaces, that
a direct integral of Hilbert spaces is a Hilbert space.

Let us think of two important special cases of the direct integral
construction. First, if each of the $\mathbf{H}_\lambda$'s is simply $\mathbb{C}$, then the direct integral
(with the obvious measurability structure) is simply $L^2(X, \mu)$. Second,
suppose that $X = \{\lambda_1, \lambda_2, \ldots\}$ is countable, $\Omega$ is the $\sigma$-algebra of all subsets
of $X$, and $\mu$ is the counting measure on $X$. Then the direct integral is the
Hilbert space direct sum (Definition A.45).
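
To make the second special case concrete, here is a short worked computation. It is a sketch under the stated assumptions ($X$ countable, every subset measurable); the weights $c_k$ in the last line are an added illustration and do not appear in the text.

```latex
\begin{align*}
\|s\|^2 &= \int_X \langle s(\lambda), s(\lambda) \rangle_\lambda \, d\mu(\lambda)
         = \sum_{k} \| s(\lambda_k) \|_{\lambda_k}^2
         && \text{($\mu$ the counting measure)}, \\
\|s\|^2 &= \sum_{k} c_k \, \| s(\lambda_k) \|_{\lambda_k}^2
         && \text{(if instead $\mu(\{\lambda_k\}) = c_k > 0$)}.
\end{align*}
```

In the first line the finite-norm sections are exactly the square-summable sequences $(s(\lambda_1), s(\lambda_2), \ldots)$ with $s(\lambda_k) \in \mathbf{H}_{\lambda_k}$, which is the description of the Hilbert space direct sum.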




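Given a direct integral, suppose we have some $\lambda_0 \in X$ for which $\{\lambda_0\}$
is measurable and such that $c := \mu(\{\lambda_0\}) > 0$. Then we can embed $\mathbf{H}_{\lambda_0}$
isometrically into the direct integral by mapping each $\psi \in \mathbf{H}_{\lambda_0}$ to the
section $s$ given by

\[ s(\lambda) = \begin{cases} \dfrac{1}{\sqrt{c}} \, \psi, & \lambda = \lambda_0 \\[1ex] 0, & \lambda \neq \lambda_0. \end{cases} \]

Even if $\mu(\{\lambda_0\}) = 0$, we may still think that $\mathbf{H}_{\lambda_0}$ is a sort of “generalized
subspace” of the direct integral.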

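Theorem 7.19 (Spectral Theorem, Second Form) If $A \in \mathcal{B}(\mathbf{H})$ is
self-adjoint, then there exists a $\sigma$-finite measure $\mu$ on $\sigma(A)$, a direct
integral

\[ \int_{\sigma(A)}^{\oplus} \mathbf{H}_\lambda \, d\mu(\lambda), \]

and a unitary map $U$ between $\mathbf{H}$ and the direct integral such that

\[ [U A U^{-1}(s)](\lambda) = \lambda s(\lambda) \tag{7.20} \]

for all sections $s$ in the direct integral.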

The proof of Theorem 7.19 is given in the next chapter, along with the
proof of our first version of the spectral theorem. In the meantime, let us
think about what this version of the spectral theorem is saying. We may
think that the unitary map $U$ is an identification of our original Hilbert
space $\mathbf{H}$ with a certain direct integral over the spectrum of $A$. Under this
identification, the self-adjoint operator $A$ becomes the operator of
multiplication by $\lambda$, that is, the map sending the section $s(\lambda)$ to $\lambda s(\lambda)$. Roughly
speaking, then, the operator $A$ acts (under our identification) as $\lambda I$ on
each space $\mathbf{H}_\lambda$. Thus, we may think of $\mathbf{H}_\lambda$ as being something like an
“eigenspace” for $A$, for each element $\lambda$ of the spectrum of $A$. Of course,
unless $\mu(\{\lambda\}) > 0$, the Hilbert space $\mathbf{H}_\lambda$ is not actually contained in $\mathbf{H}$.
Nevertheless, we may think of elements of a given $\mathbf{H}_\lambda$ as “generalized
eigenvectors” for the operator $A$.
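
For orientation, here is how Theorem 7.19 looks in the finite-dimensional case. This is an illustrative sketch under the added assumption that $\mathbf{H} = \mathbb{C}^n$, so that $A$ has distinct eigenvalues $\lambda_1, \dots, \lambda_r$ with eigenspaces $W_k$ and orthogonal projections $P_k$; the symbols $W_k$, $P_k$, and $r$ are introduced here and are not notation from the text.

```latex
\begin{align*}
&\mu = \text{counting measure on } \sigma(A) = \{\lambda_1, \dots, \lambda_r\},
\qquad \mathbf{H}_{\lambda_k} = W_k,
\qquad \int_{\sigma(A)}^{\oplus} \mathbf{H}_\lambda \, d\mu(\lambda) = \bigoplus_{k=1}^{r} W_k, \\
&(U\psi)(\lambda_k) = P_k \psi,
\qquad \|U\psi\|^2 = \sum_{k=1}^{r} \|P_k \psi\|^2 = \|\psi\|^2,
\qquad [U A U^{-1}(s)](\lambda_k) = \lambda_k \, s(\lambda_k).
\end{align*}
```

Here every point of the spectrum has positive measure, so each $\mathbf{H}_{\lambda_k}$ is a genuine eigenspace sitting inside $\mathbf{H}$, and $\dim \mathbf{H}_{\lambda_k}$ is the usual multiplicity of the eigenvalue $\lambda_k$.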

The direct integral formulation of the spectral theorem leads readily to a
classification result for bounded self-adjoint operators. See Proposition 7.24
later in this section. Meanwhile, as we noted earlier in this section, the
method of proof for Theorem 7.19 also yields a version of the spectral
theorem involving multiplication operators on ordinary $L^2$ spaces.

Theorem 7.20 (Spectral Theorem, Multiplication Operator Form)
Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint. Then there exists a $\sigma$-finite measure
space $(X, \mu)$, a bounded, measurable, real-valued function $h$ on $X$, and a
unitary map $U : \mathbf{H} \to L^2(X, \mu)$ such that

\[ [U A U^{-1}(\psi)](\lambda) = h(\lambda)\psi(\lambda) \]

for all $\psi \in L^2(X, \mu)$.



We return now to a discussion of the direct integral version of the spectral
theorem. This version gives a simple description of the functional calculus.

Proposition 7.21 Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and $U$ is a unitary
map as in Theorem 7.19. Then for any bounded measurable function $f$ on
$\sigma(A)$, we have

\[ [U f(A) U^{-1}(s)](\lambda) = f(\lambda) s(\lambda). \]

Thus, roughly speaking, $f(A)$ is defined to be $f(\lambda)I$ on each “generalized
eigenspace” $\mathbf{H}_\lambda$. Proposition 7.21 follows directly from (7.20) if $f$ is a
polynomial; the result for continuous $f$ then follows by taking uniform limits.
The result for general $f$ is then easily established by using the limiting
arguments of Chap. 8, especially Exercise 3.

Let us now consider what sort of uniqueness there should be in the second
version of the spectral theorem. There is a “trivial” source of nonuniqueness
coming from the possibility that some of the $\mathbf{H}_\lambda$'s may have dimension 0.
Let $E_0$ denote the set of $\lambda$ for which $\dim \mathbf{H}_\lambda = 0$. Even if $\mu(E_0) > 0$, the set
$E_0$ makes no contribution to the norm of a section, since every section is
automatically zero on $E_0$. Thus, we may define a new measure $\tilde{\mu}$ by setting
$\tilde{\mu}(E) = \mu(E \cap E_0^c)$, so that $\tilde{\mu}$ agrees with $\mu$ on $E_0^c$ but is zero on $E_0$. Then
the direct integrals of the $\mathbf{H}_\lambda$'s with respect to $\mu$ and with respect to $\tilde{\mu}$ are
“indistinguishable.” Thus, we can always modify a direct integral so as to
assume that $\dim \mathbf{H}_\lambda > 0$ for almost every $\lambda$.

Meanwhile, unlike the projection-valued measure $\mu^A$ in Theorem 7.12,
the measure $\mu$ in Theorem 7.19 is not unique, but only unique up to
equivalence, where two $\sigma$-finite measures on a given measurable space are
equivalent if they have precisely the same sets of measure zero. For a given measure
$\mu$, the Hilbert spaces $\mathbf{H}_\lambda$ are unique only up to unitary equivalence,
meaning that only the dimension of the spaces is uniquely determined. Even
the dimension of $\mathbf{H}_\lambda$ is uniquely determined only up to a set of $\mu$-measure
zero. As it turns out, the sources of nonuniqueness in this paragraph and
the previous paragraph are all that exist.

Proposition 7.22 (Uniqueness in Theorem 7.19) Suppose $A \in \mathcal{B}(\mathbf{H})$
is self-adjoint and consider two different direct integrals as in Theorem 7.19,
one with measure $\mu^{(1)}$ and Hilbert spaces $\mathbf{H}_\lambda^{(1)}$ and the other with
measure $\mu^{(2)}$ and Hilbert spaces $\mathbf{H}_\lambda^{(2)}$. If $\dim \mathbf{H}_\lambda^{(j)} > 0$ for $\mu^{(j)}$-almost every $\lambda$
($j = 1, 2$), then $\mu^{(1)}$ and $\mu^{(2)}$ are mutually absolutely continuous and

\[ \dim \mathbf{H}_\lambda^{(1)} = \dim \mathbf{H}_\lambda^{(2)} \]

for $\mu^{(j)}$-almost every $\lambda$ ($j = 1, 2$).

See the end of the next chapter for a sketch of the proof of this uniqueness
result.

Theorem 7.19 should be thought of as a refinement of our earlier form
(Theorem 7.12) of the spectral theorem, in the sense that we can easily
recover Theorem 7.12 from Theorem 7.19. In the setting of Theorem 7.19,
and given a measurable set $E \subset \sigma(A)$, let $V_E$ denote the space of
(equivalence classes of) sections $s$ that are supported on $E$, that is, for which
$s(\lambda) = 0$ for $\mu$-almost every $\lambda$ in $E^c$. This is easily seen to be a closed
subspace. Let $P_E$ denote the orthogonal projection onto $V_E$, and define

\[ \mu^A(E) = U^{-1} P_E U. \tag{7.21} \]

It is straightforward to check that $\mu^A$ is a projection-valued measure on
$\sigma(A)$, with values in $\mathcal{B}(\mathbf{H})$, and that $\int_{\sigma(A)} \lambda \, d\mu^A(\lambda) = A$.

Note that both versions of the spectral theorem for $A$ involve a measure,
the first, denoted $\mu^A$, being a projection-valued measure, and the second,
denoted $\mu$, being an ordinary measure with values in the non-negative real
numbers. The following result shows the relationship between the two
measures.

Proposition 7.23 Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint, $\mu^A$ is the
projection-valued measure given by Theorem 7.12 and $\mu$ is a real-valued measure as
in Theorem 7.19. If $\dim \mathbf{H}_\lambda > 0$ for $\mu$-almost every $\lambda$, then for any Borel
set $E \subset \sigma(A)$, $\mu^A(E) = 0$ if and only if $\mu(E) = 0$.

Of course, the 0 in the expression $\mu^A(E) = 0$ is the zero operator, whereas
the 0 in the expression $\mu(E) = 0$ is the number 0. Nevertheless, we may
think of Proposition 7.23 as saying that $\mu^A$ and $\mu$ are equivalent in the
usual measure-theoretic sense, having precisely the same sets of measure
zero.

Proof. As we have remarked, given a direct integral as in Theorem 7.19,
we can construct a projection-valued measure by means of (7.21), and this
projection-valued measure satisfies $\int_{\sigma(A)} \lambda \, d\mu^A(\lambda) = A$. This
projection-valued measure must coincide with the one in Theorem 7.12, by the
uniqueness in that theorem.

Now, if $\mu(E) = 0$, then any section supported on $E$ is zero almost
everywhere and thus represents the zero element of the direct integral. In that
case, $V_E = \{0\}$ and so $\mu^A(E) = 0$ by (7.21). In the other direction, suppose
$\mu(E) > 0$. Since $\mu$ is $\sigma$-finite, $E$ will contain a measurable subset $F$ such
that $0 < \mu(F) < \infty$. Then let $s$ be the section given by

oo 1 

S (A) = E^(A) 

i=i 

for $\lambda \in F$ and $s(\lambda) = 0$ for $\lambda \in F^c$, where $\{e_j(\cdot)\}$ is our measurability
structure for the direct integral. Then

(s(X),e j {X)) x = (e j {X),e j (X)) x l F (X), 

which is a measurable function of $\lambda$ for all $j$, so that $s$ is measurable. Since
we assume that $\mathbf{H}_\lambda$ has nonzero dimension for $\mu$-almost every $\lambda$, $s$ will be
nonzero almost everywhere on $F$ and thus will have positive norm. The
norm of $s$ is finite because $\|s(\lambda)\| \le 1$ and $F$ has finite measure. Thus,
$V_E \neq \{0\}$ and $\mu^A(E) \neq 0$. ■

We say that self-adjoint operators $A_1$ and $A_2$ on Hilbert spaces $\mathbf{H}_1$ and
$\mathbf{H}_2$ are unitarily equivalent if there exists a unitary map $U : \mathbf{H}_1 \to \mathbf{H}_2$
such that

\[ A_2 = U A_1 U^{-1}. \]

Using Proposition 7.22, we can give a classification of bounded self-adjoint
operators on separable Hilbert spaces up to unitary equivalence. For a given
bounded self-adjoint operator $A$, we call the function $\lambda \mapsto \dim \mathbf{H}_\lambda$ the
multiplicity function for $A$. It is well defined (independent of the choice of
direct integral decomposition) up to a set of measure zero. It turns out that
bounded self-adjoint operators are characterized, up to unitary equivalence,
by the spectrum of $A$ as a set, the equivalence class of the measure $\mu$ in
Theorem 7.19, and the multiplicity function.

Proposition 7.24 Suppose $A_1$ and $A_2$ are bounded self-adjoint operators
on separable Hilbert spaces $\mathbf{H}_1$ and $\mathbf{H}_2$, respectively. Choose direct integral
representations for $A_1$ and $A_2$ as in Theorem 7.19, with the associated
measures $\mu_1$ and $\mu_2$ chosen so that $\dim \mathbf{H}_\lambda > 0$ for $\mu_j$-almost every $\lambda$
($j = 1, 2$). Then $A_1$ and $A_2$ are unitarily equivalent if and only if the
following conditions are satisfied.

1. $\sigma(A_1) = \sigma(A_2)$.

2. The measures $\mu_1$ and $\mu_2$ are mutually absolutely continuous.

3. The multiplicity functions of $A_1$ and $A_2$ coincide up to a set of
measure zero.

See Exercise 12 for a proof of this result.


7.4 Exercises 

1. Suppose $A$ and $B$ are commuting linear operators on a nonzero
finite-dimensional vector space.

(a) Show that each eigenspace for $A$ is invariant under $B$.

(b) Show that $A$ and $B$ have at least one simultaneous eigenvector,
that is, a nonzero vector $v$ with $Av = \lambda v$ and $Bv = \mu v$, for some
constants $\lambda, \mu \in \mathbb{C}$.

2. Suppose that $A \in \mathcal{B}(\mathbf{H})$ is normal, meaning that $AA^* = A^*A$.
Suppose that for some $\psi \in \mathbf{H}$ and $\lambda \in \mathbb{C}$ we have $A\psi = \lambda\psi$. Show that
$A^*\psi = \bar{\lambda}\psi$.

Hint: Compute $\|(A^* - \bar{\lambda} I)\psi\|$.

3. Suppose a closed subspace $V$ of $\mathbf{H}$ is invariant under a bounded
operator $A$, meaning that $A\psi \in V$ for all $\psi \in V$. Show that the orthogonal
complement $V^\perp$ of $V$ is invariant under $A^*$.

4. (a) Suppose that $\mathbf{H}$ is a finite-dimensional Hilbert space over $\mathbb{C}$ and
$A$ is a normal linear operator on $\mathbf{H}$ in the sense of Exercise 2.
Show that there exists an orthonormal basis for $\mathbf{H}$ consisting of
simultaneous eigenvectors for $A$ and $A^*$.

Hint: Use Exercises 1 and 3.

(b) Suppose $A$ is a linear operator on a finite-dimensional Hilbert
space $\mathbf{H}$ over $\mathbb{C}$ and suppose there exists an orthonormal basis
for $\mathbf{H}$ consisting of eigenvectors of $A$. Show that $A$ commutes
with $A^*$.

5. Suppose $A \in \mathcal{B}(\mathbf{H})$ has an inverse $A^{-1}$ in $\mathcal{B}(\mathbf{H})$. Show that
$(A^{-1})^* A^* = A^* (A^{-1})^* = I$. Conclude that $A^*$ is invertible and $(A^*)^{-1} = (A^{-1})^*$.

6. Suppose $U$ is a unitary operator on $\mathbf{H}$ (Definition A.55). Show that
the spectrum of $U$ is contained in the unit circle.

Hint: By writing $U - \lambda I$ as $(-\lambda)(I - U/\lambda)$ or as $U(I - \lambda U^{-1})$, show
that any $\lambda$ with $|\lambda| \neq 1$ is in the resolvent set of $U$.

7. Suppose that $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and non-negative, that is, that
$A$ satisfies (7.3). Show that the spectrum of $A$ is contained in the
interval $[0, \infty)$.

Note: Conversely, if $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and $\sigma(A) \subset [0, \infty)$, then
$A$ is non-negative. See Exercise 2 in Chap. 8.

8. Suppose $A \in \mathcal{B}(\mathbf{H})$ is invertible. Show that there exists $\varepsilon > 0$ such
that for all $B \in \mathcal{B}(\mathbf{H})$ with $\|B - A\| < \varepsilon$, $B$ is also invertible.

Hint: Use a power series argument as in the proof of Proposition 7.5.

9. Assume $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint.

(a) Suppose $\lambda_0 \in \mathbb{C}$ is a point in the resolvent set of $A$. Show that

\[ \|(A - \lambda_0 I)^{-1}\| = \frac{1}{d(\lambda_0, \sigma(A))}, \]

where $d(\lambda_0, \sigma(A)) = \inf_{\lambda \in \sigma(A)} |\lambda - \lambda_0|$.

Hint: Think of $(A - \lambda_0 I)^{-1}$ as a function of $A$ in the sense of
the functional calculus for $A$.

(b) Given $\lambda_0 \in \mathbb{C}$, suppose that there exists some nonzero $\psi \in \mathbf{H}$
such that

\[ \|A\psi - \lambda_0 \psi\| < \varepsilon \|\psi\|. \]

Show that there exists $\lambda \in \sigma(A)$ such that $|\lambda - \lambda_0| < \varepsilon$.

10. Suppose $V_1$ and $V_2$ are two closed subspaces of $\mathbf{H}$, with associated
orthogonal projections $P_1$ and $P_2$. Show that $V_1$ and $V_2$ are orthogonal
if and only if $P_1 P_2 = 0$.

11. Suppose $\mu$ is a projection-valued measure on $(X, \Omega)$. Show that for
any $E_1, E_2 \in \Omega$, $\mu(E_1)\mu(E_2)$ is the projection onto the closed
subspace $\operatorname{Range}(\mu(E_1)) \cap \operatorname{Range}(\mu(E_2))$.

Hint: Write $E_1$ as $E_1 = (E_1 \cap E_2) \cup (E_1 \setminus E_2)$ and use Exercise 10.

12. Prove Proposition 7.24.

Hint: Use Proposition 7.22 and the Radon-Nikodym theorem
(Theorem A.6).


8 

The Spectral Theorem for Bounded 
Self-Adjoint Operators: Proofs 


In this chapter we give proofs of all versions of the spectral theorem stated 
in the previous chapter. 


8.1 Proof of the Spectral Theorem, First Version 

A proof of the spectral theorem, in its projection-valued measure form, can
be obtained in two main stages. The first stage of the proof is to define a
continuous functional calculus, meaning we associate with each continuous
function $f$ on $\sigma(A)$ an operator $f(A)$. The map $f \mapsto f(A)$ should have the
property that if $f$ is the function $f(\lambda) = \lambda^m$, then $f(A) = A^m$. The
continuous functional calculus is then constructed by approximating continuous
functions on $\sigma(A)$ by polynomials. The Stone-Weierstrass theorem tells us
that polynomials are dense in the continuous functions on $\sigma(A)$; it remains
only to show that if a sequence $p_n$ of polynomials converges uniformly to
some continuous function $f$ on $\sigma(A)$, then the operators $p_n(A)$ converge to
some operator, which we will then call $f(A)$.

The second stage of the proof is to show that the continuous functional 
calculus can be represented as integration against a projection-valued mea¬ 
sure. This result is just an operator-valued version of the Riesz represen¬ 
tation theorem from measure theory (Theorem 8.5). Indeed, we will see 
that this operator-valued version of the Riesz representation theorem can 
be reduced to the usual form of the theorem. 


8.1.1 Stage 1: The Continuous Functional Calculus 

We begin by defining, for any $A \in \mathcal{B}(\mathbf{H})$, the spectral radius $R(A)$ by

\[ R(A) = \sup_{\lambda \in \sigma(A)} |\lambda|. \]

(By Propositions 7.5 and 7.7, $\sigma(A)$ is a nonempty, bounded subset of $\mathbb{R}$.)
According to Point 2 of Proposition 7.5, we have

\[ R(A) \le \|A\| \]

for any $A \in \mathcal{B}(\mathbf{H})$. In general, $\|A\|$ can be much bigger than $R(A)$. For
example, if $A$ is a nilpotent matrix, then $R(A) = 0$ but $\|A\|$ can be arbitrarily
large.
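
A minimal instance of the nilpotent example may help; the matrix $N$ and the constant $c$ below are chosen here for illustration and are not taken from the text.

```latex
N = \begin{pmatrix} 0 & c \\ 0 & 0 \end{pmatrix}, \qquad
N^2 = 0, \qquad
\sigma(N) = \{0\}, \qquad
R(N) = 0, \qquad
\|N\| = \sup_{\|v\| = 1} \|N v\| = |c| .
```

Since $N \neq N^*$ whenever $c \neq 0$, this does not conflict with the equality of norm and spectral radius for self-adjoint operators.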

Lemma 8.1 If $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint, the norm and the spectral radius
of $A$ are equal:

\[ \|A\| = R(A). \]

In preparation for the proof, we determine the radius of convergence of
the power series for the resolvent given in the proof of Proposition 7.5.
According to Proposition 7.2, we have

\[ \|A^* A\| = \|A\|^2 \]

for any $A \in \mathcal{B}(\mathbf{H})$. If $A$ is self-adjoint, we obtain

\[ \|A^2\| = \|A\|^2. \]

Iterating this relation gives

\[ \|A^{2^n}\| = \|A\|^{2^n} \tag{8.1} \]

for all $n$.

Consider, for a bounded self-adjoint operator $A$, the following formal
expression for the resolvent of $A$:

\[ (A - \lambda I)^{-1} = -\sum_{m=0}^{\infty} \frac{A^m}{\lambda^{m+1}}. \tag{8.2} \]

If $|\lambda| > \|A\|$, then the proof of Proposition 7.5 shows that the series (8.2)
converges in the operator norm topology and that the sum of the series is
indeed the inverse of $(A - \lambda I)$. If, on the other hand, $|\lambda| < \|A\|$, it follows
from (8.1) that the norms of the terms in (8.2) do not tend to zero, and
so the series cannot converge in the operator norm topology. We may say,
then, that the series (8.2) has radius of convergence equal to $\|A\|$.
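The relation (8.1) and the contrast with the nilpotent case can be checked numerically. The following is a minimal sketch in pure Python on $2 \times 2$ real matrices; the helper names (`matmul`, `op_norm`, and so on) and the sample matrices are illustrative assumptions, not taken from the text.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 real matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def sym_eigenvalues(S):
    """Eigenvalues of a real symmetric 2x2 matrix, via the quadratic formula."""
    mean = (S[0][0] + S[1][1]) / 2.0
    disc = math.sqrt(((S[0][0] - S[1][1]) / 2.0) ** 2 + S[0][1] ** 2)
    return (mean - disc, mean + disc)

def op_norm(X):
    """Operator norm: square root of the largest eigenvalue of X^T X."""
    return math.sqrt(max(sym_eigenvalues(matmul(transpose(X), X))))

def spectral_radius_sym(S):
    """Spectral radius of a real symmetric 2x2 matrix."""
    return max(abs(lam) for lam in sym_eigenvalues(S))

A = [[2.0, 1.0], [1.0, -1.0]]   # self-adjoint (real symmetric)
N = [[0.0, 50.0], [0.0, 0.0]]   # nilpotent: N^2 = 0, so its spectrum is {0}

# Lemma 8.1: the norm equals the spectral radius for self-adjoint A.
norm_A = op_norm(A)
radius_A = spectral_radius_sym(A)

# Relation (8.1): ||A^(2^n)|| = ||A||^(2^n), checked for n = 1, 2, 3.
power = A
checks = []
for n in range(1, 4):
    power = matmul(power, power)            # power is now A^(2^n)
    checks.append((op_norm(power), norm_A ** (2 ** n)))

# The nilpotent matrix has spectral radius 0 but a large norm.
norm_N = op_norm(N)
```

Here `norm_A` and `radius_A` agree up to rounding, each pair in `checks` agrees, and `norm_N` is large even though the only point in the spectrum of `N` is $0$.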

Proof of Lemma 8.1. We know that $R(A) \le \|A\|$. To show that $R(A) =
\|A\|$, we wish to argue that $(A - \lambda I)^{-1}$ is a holomorphic operator-valued
function of $\lambda$ on the set $|\lambda| > R(A)$, and therefore the Laurent series
of $(A - \lambda I)^{-1}$ must converge for $|\lambda| > R(A)$. But the Laurent series of
$(A - \lambda I)^{-1}$ is just the series in (8.2), and we have shown that the series
diverges when $|\lambda| < \|A\|$. This would be a contradiction if $R(A)$ were less
than $\|A\|$.

To flesh out the argument, recall the formula (7.8) in the proof of
Proposition 7.5 for the resolvent of $A$.

That formula expresses the map $\lambda \mapsto (A - \lambda I)^{-1}$ as a convergent power
series in powers of $\lambda - \lambda_0$, near any point $\lambda_0$ in the resolvent set of $A$. It
follows that for any bounded linear functional $\xi \in \mathcal{B}(\mathbf{H})^*$, the
complex-valued function

\[ \lambda \mapsto \xi\big((A - \lambda I)^{-1}\big) \]

is holomorphic on the resolvent set of $A$. This function has a unique Laurent
series, which is given by applying $\xi$ term by term to (8.2). The series will
converge on the largest annulus contained in the resolvent set of $A$, namely
the set of $\lambda$ with $|\lambda| > R(A)$.

Convergence of (8.2) means that $|\xi(A^m/\lambda^{m+1})|$ is bounded as a function
of $m$, for each $\xi$ and each $\lambda$ with $|\lambda| > R(A)$. Thus, by (a corollary of) the
uniform boundedness principle (Appendix A.3.4), the set $\{A^m/\lambda^{m+1}\}_{m=0}^{\infty}$
is bounded in the Banach space $\mathcal{B}(\mathbf{H})$, for all $\lambda$ with $|\lambda| > R(A)$. In
particular, for each $\lambda$ with $|\lambda| > R(A)$, there is a constant $C$ such that,
for all $n$,

\[ \frac{\|A^{2^n}\|}{|\lambda|^{2^n}} = \frac{\|A\|^{2^n}}{|\lambda|^{2^n}} \le C. \]

If $\|A\|$ were greater than $R(A)$, this inequality would be false for $\lambda$ satisfying
$R(A) < |\lambda| < \|A\|$. ■

The next key step in Stage 1 of the proof is to understand how the
spectrum of a self-adjoint operator transforms under application of a
polynomial.

Lemma 8.2 (Spectral Mapping Theorem) For all $A \in \mathcal{B}(\mathbf{H})$ and all
polynomials $p$, we have

\[ \sigma(p(A)) = p(\sigma(A)). \]

That is to say, the spectrum of $p(A)$ consists precisely of the numbers of
the form $p(\lambda)$, with $\lambda$ in the spectrum of $A$.

Proof. The result is trivial if $p$ is constant. When $\deg p \ge 1$, let $p$, given by

\[ p(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_0, \]

be an arbitrary polynomial. We first show that $p(\sigma(A)) \subset \sigma(p(A))$.
Suppose, then, that $\lambda \in \sigma(A)$. Observe that

\[ p(A) - p(\lambda)I = a_n(A^n - \lambda^n I) + a_{n-1}(A^{n-1} - \lambda^{n-1} I) + \cdots + a_0 I - a_0 I. \]

Now,

\[ A^k - \lambda^k I = (A - \lambda I)(A^{k-1} + \lambda A^{k-2} + \lambda^2 A^{k-3} + \cdots + \lambda^{k-1} I). \]

Thus, we can pull out a factor of $(A - \lambda I)$ from each nonzero term in
$p(A) - p(\lambda)I$, giving

\[ p(A) - p(\lambda)I = (A - \lambda I)q(A), \]

where $q$ is a polynomial (depending on $\lambda$). Since, by assumption, $A - \lambda I$ is
not invertible, and since $(A - \lambda I)$ commutes with $q(A)$, $(A - \lambda I)q(A)$ cannot
be invertible (Exercise 1). This shows that $p(\lambda)$ belongs to the spectrum of
$p(A)$.

We now show that $\sigma(p(A)) \subset p(\sigma(A))$. Suppose, then, that $\gamma \in \sigma(p(A))$.
Since $\mathbb{C}$ is algebraically closed, we can factor the polynomial $p(z) - \gamma$, as a
function of $z$, as

\[ p(z) - \gamma = c(z - b_1)(z - b_2) \cdots (z - b_n). \tag{8.3} \]

Thus,

\[ p(A) - \gamma I = c(A - b_1 I)(A - b_2 I) \cdots (A - b_n I). \]

Since $p(A) - \gamma I$ is assumed to be noninvertible, there must be some $j$ such
that $(A - b_j I)$ is noninvertible, that is, for which $b_j \in \sigma(A)$. Then (8.3)
tells us that $p(b_j) - \gamma = 0$, meaning that $\gamma = p(b_j)$. Thus, $\gamma$ is of the form
$p(\lambda)$ for some $\lambda$ $(= b_j)$ in $\sigma(A)$. ■
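In finite dimensions the spectral mapping theorem says that the eigenvalues of $p(A)$ are exactly the values of $p$ at the eigenvalues of $A$. The sketch below checks this for one symmetric $2 \times 2$ matrix and one cubic polynomial; the matrix, the polynomial, and the helper names are illustrative assumptions, and the symmetric case is used only because its eigenvalues have a simple closed form.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 real matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sym_eigenvalues(S):
    """Eigenvalues of a real symmetric 2x2 matrix, via the quadratic formula."""
    mean = (S[0][0] + S[1][1]) / 2.0
    disc = math.sqrt(((S[0][0] - S[1][1]) / 2.0) ** 2 + S[0][1] ** 2)
    return (mean - disc, mean + disc)

def poly_of_matrix(coeffs, X):
    """Evaluate p(X) = sum_k coeffs[k] X^k by Horner's rule (coeffs[0] is a_0)."""
    result = [[0.0, 0.0], [0.0, 0.0]]
    for a in reversed(coeffs):
        result = matmul(result, X)   # result <- result * X
        result[0][0] += a            # result <- result + a * I
        result[1][1] += a
    return result

def poly_of_number(coeffs, x):
    """Evaluate the same polynomial at a real number x."""
    value = 0.0
    for a in reversed(coeffs):
        value = value * x + a
    return value

A = [[2.0, 1.0], [1.0, -1.0]]
coeffs = [1.0, -3.0, 0.0, 2.0]       # p(z) = 2 z^3 - 3 z + 1

# p(A) is again symmetric, so the closed-form eigenvalue formula applies to it.
spectrum_of_pA = sorted(sym_eigenvalues(poly_of_matrix(coeffs, A)))
p_of_spectrum = sorted(poly_of_number(coeffs, lam) for lam in sym_eigenvalues(A))
```

The two sorted lists `spectrum_of_pA` and `p_of_spectrum` agree up to rounding, which is the finite-dimensional content of Lemma 8.2 for this example.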

The last step in Stage 1 of our proof is to apply the Stone-Weierstrass
theorem to show that polynomials are dense in $C(\sigma(A);\mathbb{R})$ (the space of
continuous, real-valued functions on $\sigma(A)$) with respect to the supremum
norm.

Proposition 8.3 Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint. Then there exists a
unique bounded linear map from $C(\sigma(A);\mathbb{R})$ into $\mathcal{B}(\mathbf{H})$, denoted by $f \mapsto
f(A)$, such that when $f(\lambda) = \lambda^m$, we have $f(A) = A^m$. The map $f \mapsto f(A)$,
$f \in C(\sigma(A);\mathbb{R})$, is called the (real-valued) functional calculus for $A$.

Proof. Note that if $A$ is self-adjoint, then $p(A)$ is self-adjoint provided
that $p$ is a real-valued polynomial (i.e., one where all the coefficients are
real numbers). Thus, combining the spectral mapping theorem with the
equality of the norm and spectral radius, we have the following: If $A$ is a
self-adjoint operator and $p$ is a real-valued polynomial, then

\[ \|p(A)\| = \sup_{\lambda \in \sigma(A)} |p(\lambda)|. \tag{8.4} \]

Thus, the map $p \mapsto p(A)$ is an isometric linear map from the space of
polynomials on $\sigma(A)$ (with the supremum norm) into the space of bounded
operators on $\mathbf{H}$.

According to the Stone-Weierstrass theorem, polynomials are dense in
$C(\sigma(A);\mathbb{R})$. Thus, by the BLT theorem (Theorem A.36), we can extend the
map $p \mapsto p(A)$ uniquely to a bounded linear map of $C(\sigma(A);\mathbb{R})$ into $\mathcal{B}(\mathbf{H})$.
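The isometry (8.4) is what makes the extension work: if polynomials $p_n$ converge uniformly on $\sigma(A)$, then the operators $p_n(A)$ form a Cauchy sequence in norm. The sketch below illustrates this with Taylor polynomials of the exponential on a symmetric $2 \times 2$ matrix; the choice of function, matrix, and helper names is an assumption made for illustration only.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 real matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sym_eigenvalues(S):
    """Eigenvalues of a real symmetric 2x2 matrix, via the quadratic formula."""
    mean = (S[0][0] + S[1][1]) / 2.0
    disc = math.sqrt(((S[0][0] - S[1][1]) / 2.0) ** 2 + S[0][1] ** 2)
    return (mean - disc, mean + disc)

def op_norm_sym(S):
    """Operator norm of a real symmetric 2x2 matrix: sqrt of top eigenvalue of S^2."""
    return math.sqrt(max(sym_eigenvalues(matmul(S, S))))

def poly_of_matrix(coeffs, X):
    """Evaluate sum_k coeffs[k] X^k by Horner's rule."""
    result = [[0.0, 0.0], [0.0, 0.0]]
    for a in reversed(coeffs):
        result = matmul(result, X)
        result[0][0] += a
        result[1][1] += a
    return result

def poly_of_number(coeffs, x):
    value = 0.0
    for a in reversed(coeffs):
        value = value * x + a
    return value

def taylor_exp(n):
    """Coefficients of the degree-n Taylor polynomial of exp."""
    return [1.0 / math.factorial(k) for k in range(n + 1)]

A = [[2.0, 1.0], [1.0, -1.0]]
spectrum = sym_eigenvalues(A)

rows = []
for n in (2, 4, 8):
    # Coefficients of the polynomial p_{2n} - p_n.
    diff = [a - b for a, b in zip(taylor_exp(2 * n), taylor_exp(n) + [0.0] * n)]
    lhs = op_norm_sym(poly_of_matrix(diff, A))                     # ||p_2n(A) - p_n(A)||
    rhs = max(abs(poly_of_number(diff, lam)) for lam in spectrum)  # sup over sigma(A)
    rows.append((n, lhs, rhs))
```

In each row the operator norm of $p_{2n}(A) - p_n(A)$ matches the supremum of $|p_{2n} - p_n|$ over the spectrum, and both shrink as $n$ grows.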


Proposition 8.4 If $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint, the (real-valued) continuous
functional calculus for $A$, mapping $C(\sigma(A);\mathbb{R})$ into $\mathcal{B}(\mathbf{H})$, has the following
properties.

1. Multiplicativity: For all $f, g$, we have

\[ (fg)(A) = f(A)g(A), \]

where $fg$ denotes the pointwise product of $f$ and $g$.

2. Self-adjointness: For all $f$, the operator $f(A)$ is self-adjoint.

3. Non-negativity: For all $f$, if $f$ is non-negative, then $f(A)$ is a
non-negative operator.

4. Norm and spectrum properties: For all $f$, we have

\[ \|f(A)\| = \sup_{\lambda \in \sigma(A)} |f(\lambda)| \tag{8.5} \]

and

\[ \sigma(f(A)) = \{ f(\lambda) \mid \lambda \in \sigma(A) \}. \tag{8.6} \]

Proof. Point 1 holds for polynomials and thus, by taking limits, for all
$f \in C(\sigma(A);\mathbb{R})$. Furthermore, if $p$ is a real-valued polynomial and $A$ is
self-adjoint, then $p(A)$ is self-adjoint. From this, we get Point 2 by taking
limits. If $f \in C(\sigma(A);\mathbb{R})$ is non-negative, then $f = g^2$, where $g = \sqrt{f}$ is
real-valued. Thus, $g(A)$ is self-adjoint and for all $\psi \in \mathbf{H}$, Point 1 tells us
that

\[ \langle \psi, f(A)\psi \rangle = \langle \psi, g(A)^2 \psi \rangle = \langle g(A)\psi, g(A)\psi \rangle \ge 0, \tag{8.7} \]

which establishes Point 3. We have already established (8.5) in (8.4) for
polynomials; the result for general $f \in C(\sigma(A);\mathbb{R})$ follows by taking limits.

To establish (8.6), suppose first that $\lambda_0 \in \mathbb{C}$ is not in the range of $f$.
Then the function $g(\lambda) := 1/(f(\lambda) - \lambda_0)$ is continuous on $\sigma(A)$ and the
operator $g(A)$ will be the inverse of $f(A) - \lambda_0 I$, showing that $\lambda_0$ is not in
the spectrum of $f(A)$.

In the other direction, suppose that $\lambda_0 = f(\mu)$ for some $\mu \in \sigma(A)$; we
want to show that $f(\mu) \in \sigma(f(A))$. Suppose now that $f(A) - f(\mu)I$ were
invertible and choose a sequence $p_n$ of polynomials converging uniformly
to $f$ on $\sigma(A)$. By Exercise 8 in Chap. 7, any operator sufficiently close to
$f(A) - f(\mu)I$ in the operator norm topology would also be invertible. In
particular, $p_n(A) - p_n(\mu)I$ would have to be invertible for all sufficiently
large $n$, contradicting the spectral mapping theorem. ■

8.1.2 Stage 2: An Operator-Valued Riesz Representation Theorem

We turn now to Stage 2 of the proof of the spectral theorem. We will make 
use of the Riesz representation theorem from measure theory (not the result 
about continuous linear functionals on a Hilbert space). The following form 
of this result is sufficient for our purposes. 

Theorem 8.5 (Riesz Representation Theorem) Let $X$ be a compact
metric space and let $C(X;\mathbb{R})$ denote the space of continuous, real-valued
functions on $X$. Suppose $\Lambda : C(X;\mathbb{R}) \to \mathbb{R}$ is a linear functional with the
property that $\Lambda(f)$ is non-negative whenever all the values of $f$ are
non-negative. Then there exists a unique (real-valued, positive) measure $\mu$ on
the Borel $\sigma$-algebra in $X$ for which

\[ \Lambda(f) = \int_X f \, d\mu \]

for all $f \in C(X;\mathbb{R})$.

See pp. 353-354 of Volume I of [34] for a short proof in the case in which
$X$ is a compact subset of $\mathbb{R}$, which is all we really require. For the full result
stated above, see Theorems 7.2 and 7.8 in [12]. Observe that $\mu$ is a finite
measure, with $\mu(X) = \Lambda(1)$, where $1$ is the constant function.

Given a bounded self-adjoint operator $A \in \mathcal{B}(\mathbf{H})$, we have constructed,
in the previous subsection, a continuous functional calculus for $A$. This
calculus is a map, denoted $f \mapsto f(A)$, from $C(\sigma(A);\mathbb{R})$ into $\mathcal{B}(\mathbf{H})$. If $f \in
C(\sigma(A);\mathbb{R})$ is non-negative, then (Point 3 of Proposition 8.4) $f(A)$ is a
non-negative operator. Thus, given $\psi \in \mathbf{H}$, if we define a linear functional $\Lambda_\psi$
on $C(\sigma(A);\mathbb{R})$ by the formula

\[ \Lambda_\psi(f) = \langle \psi, f(A)\psi \rangle, \]

$\Lambda_\psi$ will satisfy the hypotheses of the Riesz representation theorem. Thus,
for each $\psi \in \mathbf{H}$, we obtain a unique measure $\mu_\psi$ such that

\[ \langle \psi, f(A)\psi \rangle = \int_{\sigma(A)} f(\lambda) \, d\mu_\psi(\lambda) \tag{8.8} \]

for all $f \in C(\sigma(A);\mathbb{R})$. Note that

\[ \mu_\psi(\sigma(A)) = \Lambda_\psi(1) = \|\psi\|^2. \tag{8.9} \]
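When $\mathbf{H}$ is finite-dimensional, the measure $\mu_\psi$ in (8.8) is a sum of point masses: it assigns weight $|\langle e_j, \psi \rangle|^2$ to each eigenvalue $\lambda_j$, where $e_j$ is a unit eigenvector. The following sketch verifies (8.8) and (8.9) for a symmetric $2 \times 2$ matrix with distinct eigenvalues; the matrix, the vector, the test polynomial, and all helper names are illustrative assumptions.

```python
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matvec(X, v):
    return (X[0][0] * v[0] + X[0][1] * v[1], X[1][0] * v[0] + X[1][1] * v[1])

def inner(u, v):
    """Inner product of two real vectors."""
    return u[0] * v[0] + u[1] * v[1]

def sym_eigenpairs(S):
    """(eigenvalue, unit eigenvector) pairs of a real symmetric 2x2 matrix
    whose off-diagonal entry is nonzero."""
    a, b, d = S[0][0], S[0][1], S[1][1]
    mean = (a + d) / 2.0
    disc = math.sqrt(((a - d) / 2.0) ** 2 + b ** 2)
    pairs = []
    for lam in (mean - disc, mean + disc):
        v = (b, lam - a)                 # solves (S - lam I) v = 0 when b != 0
        size = math.hypot(v[0], v[1])
        pairs.append((lam, (v[0] / size, v[1] / size)))
    return pairs

def poly_of_matrix(coeffs, X):
    result = [[0.0, 0.0], [0.0, 0.0]]
    for a in reversed(coeffs):
        result = matmul(result, X)
        result[0][0] += a
        result[1][1] += a
    return result

def poly_of_number(coeffs, x):
    value = 0.0
    for a in reversed(coeffs):
        value = value * x + a
    return value

A = [[2.0, 1.0], [1.0, -1.0]]
psi = (1.0, 2.0)

# mu_psi puts mass <e_j, psi>^2 at each eigenvalue lambda_j.
mu_psi = [(lam, inner(e, psi) ** 2) for lam, e in sym_eigenpairs(A)]

coeffs = [0.5, -1.0, 0.0, 1.0]           # f(lambda) = lambda^3 - lambda + 0.5
lhs = inner(psi, matvec(poly_of_matrix(coeffs, A), psi))          # <psi, f(A) psi>
rhs = sum(poly_of_number(coeffs, lam) * w for lam, w in mu_psi)   # integral of f d(mu_psi)
total_mass = sum(w for _, w in mu_psi)                            # should equal ||psi||^2
```

Here `lhs` is computed directly from matrix powers and `rhs` from the point masses; the two agree up to rounding, and `total_mass` equals $\|\psi\|^2 = 5$.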
Definition 8.6 If $f$ is a bounded measurable (complex-valued) function on
$\sigma(A)$, define a map $Q_f : \mathbf{H} \to \mathbb{C}$ by the formula

\[ Q_f(\psi) = \int_{\sigma(A)} f(\lambda) \, d\mu_\psi(\lambda), \]

where $\mu_\psi$ is the measure in (8.8).

If $f$ happens to be real valued and continuous, then $Q_f(\psi)$ is equal to
$\langle \psi, f(A)\psi \rangle$, in which case $Q_f$ is a bounded quadratic form. (See
Definition A.60 and Example A.62.) It turns out that $Q_f$ is a bounded quadratic
form for any bounded measurable $f$, in which case Proposition A.63 allows
us to associate with $Q_f$ a bounded operator, which we denote by $f(A)$.
Once the relevant properties of $f(A)$ are established, we will construct the
desired projection-valued measure by setting $\mu^A(E) = 1_E(A)$.

Proposition 8.7 For any bounded measurable function $f$ on $\sigma(A)$, the
map $Q_f$ in Definition 8.6 is a bounded quadratic form.

Proof. Let $\mathcal{F}$ denote the space of all bounded, Borel-measurable
functions $f$ for which $Q_f$ is a quadratic form. Then $\mathcal{F}$ is a vector space and
contains $C(\sigma(A);\mathbb{R})$. Furthermore, $\mathcal{F}$ is closed under uniformly bounded
pointwise limits, because $Q_f(\psi)$ is continuous with respect to such limits,
by dominated convergence. Standard measure-theoretic techniques
(Exercise 3) then show that $\mathcal{F}$ is the space of all bounded Borel-measurable
functions on $\sigma(A)$.

Meanwhile, it follows from (8.9) that

\[ |Q_f(\psi)| \le \sup_{\lambda \in \sigma(A)} |f(\lambda)| \, \|\psi\|^2, \]

showing that $Q_f$ is always a bounded quadratic form. ■

Definition 8.8 For a bounded measurable function $f$ on $\sigma(A)$, let $f(A)$ be
the operator associated to the quadratic form $Q_f$ by Proposition A.63. This
means that $f(A)$ is the unique operator such that

\[ \langle \psi, f(A)\psi \rangle = Q_f(\psi) = \int_{\sigma(A)} f \, d\mu_\psi \]

for all $\psi \in \mathbf{H}$.

Observe that if $f$ is real valued, then $Q_f(\psi)$ is real for all $\psi \in \mathbf{H}$, which
means (Proposition A.63) that the associated operator $f(A)$ is self-adjoint.
We will shortly associate with $A$ a projection-valued measure $\mu^A$, and we
will show that $f(A)$, as given by Definition 8.8, agrees with $f(A)$ as given
by $\int_{\sigma(A)} f(\lambda) \, d\mu^A(\lambda)$. [See (8.10) and compare Definition 7.13.]
Proposition 8.9 For any two bounded measurable functions $f$ and $g$, we
have

\[ (fg)(A) = f(A)g(A). \]

Proof. Let $\mathcal{F}_1$ denote the space of bounded measurable functions $f$ such
that $(fg)(A) = f(A)g(A)$ for all $g \in C(\sigma(A);\mathbb{R})$. Then $\mathcal{F}_1$ is a vector space
and contains $C(\sigma(A);\mathbb{R})$. We have already noted that dominated
convergence guarantees that the map $f \mapsto Q_f(\psi)$, $\psi \in \mathbf{H}$, is continuous
under uniformly bounded pointwise convergence. By the polarization identity
(Proposition A.59), the same is true for the map $f \mapsto L_f(\phi, \psi)$, where $L_f$ is
the sesquilinear form associated to $Q_f$. Now, by the polarization identity, $f$
will be in $\mathcal{F}_1$ provided that

\[ \langle \psi, (fg)(A)\psi \rangle = \langle \psi, f(A)g(A)\psi \rangle \]

or, equivalently,

\[ Q_{fg}(\psi) = L_f(\psi, g(A)\psi) \]

for all $\psi \in \mathbf{H}$ and all $g \in C(\sigma(A);\mathbb{R})$. From this, we can see that $\mathcal{F}_1$ is
closed under uniformly bounded pointwise limits. Thus, by Exercise 3, $\mathcal{F}_1$
consists of all bounded, Borel-measurable functions.

We now let $\mathcal{F}_2$ denote the space of all bounded, Borel-measurable
functions $f$ such that $(fg)(A) = f(A)g(A)$ for all bounded Borel-measurable
functions $g$. Our result for $\mathcal{F}_1$ shows that $\mathcal{F}_2$ contains $C(\sigma(A);\mathbb{R})$. Thus,
the same argument as for $\mathcal{F}_1$ shows that $\mathcal{F}_2$ consists of all bounded,
Borel-measurable functions. ■


Theorem 8.10 Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint. For any measurable set
$E \subset \sigma(A)$, define an operator $\mu^A(E)$ by

\[ \mu^A(E) = 1_E(A), \]

where $1_E(A)$ is given by Definition 8.8. Then $\mu^A$ is a projection-valued
measure on $\sigma(A)$ and satisfies

\[ \int_{\sigma(A)} \lambda \, d\mu^A(\lambda) = A. \]

Theorem 8.10 establishes the existence of the projection-valued measure
in our first version of the spectral theorem (Theorem 7.12).

Proof. Since $1_E$ is real-valued and satisfies $1_E \cdot 1_E = 1_E$, Proposition 8.4
tells us that $1_E(A)$ is self-adjoint and satisfies $1_E(A)^2 = 1_E(A)$. Thus,
$\mu^A(E)$ is an orthogonal projection (Proposition A.57), for any measurable
set $E \subset X$. If $E_1$ and $E_2$ are measurable sets, then $1_{E_1 \cap E_2} = 1_{E_1} \cdot 1_{E_2}$
and so

\[ \mu^A(E_1 \cap E_2) = \mu^A(E_1)\mu^A(E_2). \]

If $E_1, E_2, \ldots$ are disjoint measurable sets, then $\mu^A(E_j)\mu^A(E_k) = \mu^A(\emptyset) = 0$,
for $j \ne k$, and so the ranges of the projections $\mu^A(E_j)$ and $\mu^A(E_k)$ are
orthogonal. It then follows by an elementary argument that, for all $\psi \in \mathbf{H}$,
we have

\[ \sum_{j=1}^{\infty} \mu^A(E_j)\psi = P\psi, \]

where the sum converges in the norm topology of $\mathbf{H}$ and where $P$ is the
orthogonal projection onto the smallest closed subspace containing the
range of $\mu^A(E_j)$ for every $j$. On the other hand, if $E := \bigcup_{j=1}^{\infty} E_j$, then
the sequence $f_N := \sum_{j=1}^{N} 1_{E_j}$ is uniformly bounded (by 1) and converges
pointwise to $1_E$. Thus, using again dominated convergence in (8.8),

\[ \lim_{N \to \infty} \Big\langle \psi, \sum_{j=1}^{N} \mu^A(E_j)\psi \Big\rangle = \langle \psi, 1_E(A)\psi \rangle. \]

It follows that $1_E(A)$ coincides with $P$, which establishes the desired
countable additivity for $\mu^A$.

Finally, if $f = 1_E$ for some Borel set $E$, then

\[ \int_{\sigma(A)} f(\lambda) \, d\mu^A(\lambda) = f(A), \tag{8.10} \]

where $f(A)$ is given by Definition 8.8. [The integral is equal to $\mu^A(E)$, which
is, by definition, equal to $1_E(A)$.] The equality (8.10) then holds for simple
functions by linearity and for all bounded, Borel-measurable functions by
taking limits. In particular, if $f(\lambda) = \lambda$, then the integral of $f$ against $\mu^A$
agrees with $f(A)$ as defined in Definition 8.8, which agrees with $f(A)$ as
defined in the continuous functional calculus, which in turn agrees with
$f(A)$ as defined for polynomials, namely $f(A) = A$. This means that

\[ \int_{\sigma(A)} \lambda \, d\mu^A(\lambda) = A, \]

as desired. ■
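For a symmetric matrix with distinct eigenvalues, $\mu^A(E)$ is simply the sum of the rank-one eigenprojections whose eigenvalues lie in $E$. The sketch below builds $\mu^A$ this way for a $2 \times 2$ example and checks the projection property, multiplicativity on intersections, and $\int \lambda \, d\mu^A(\lambda) = A$. Sets are represented by membership predicates; that representation, the matrix, and the helper names are assumptions made for illustration.

```python
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def max_diff(X, Y):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

def sym_eigenpairs(S):
    """(eigenvalue, unit eigenvector) pairs of a real symmetric 2x2 matrix
    whose off-diagonal entry is nonzero."""
    a, b, d = S[0][0], S[0][1], S[1][1]
    mean = (a + d) / 2.0
    disc = math.sqrt(((a - d) / 2.0) ** 2 + b ** 2)
    pairs = []
    for lam in (mean - disc, mean + disc):
        v = (b, lam - a)
        size = math.hypot(v[0], v[1])
        pairs.append((lam, (v[0] / size, v[1] / size)))
    return pairs

def rank_one(e, scale=1.0):
    """scale times the orthogonal projection onto the unit vector e."""
    return [[scale * e[i] * e[j] for j in range(2)] for i in range(2)]

ZERO = [[0.0, 0.0], [0.0, 0.0]]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def mu_A(S, in_E):
    """mu^A(E): sum of eigenprojections whose eigenvalue satisfies in_E."""
    P = ZERO
    for lam, e in sym_eigenpairs(S):
        if in_E(lam):
            P = add(P, rank_one(e))
    return P

A = [[2.0, 1.0], [1.0, -1.0]]             # eigenvalues near -1.30 and 2.30
E1 = lambda lam: lam > 0.0                # contains only the positive eigenvalue
E2 = lambda lam: lam < 3.0                # contains the whole spectrum

P1 = mu_A(A, E1)
P2 = mu_A(A, E2)
P12 = mu_A(A, lambda lam: E1(lam) and E2(lam))

# Integral of lambda against mu^A: sum of lambda_j times the j-th eigenprojection.
integral = ZERO
for lam, e in sym_eigenpairs(A):
    integral = add(integral, rank_one(e, lam))
```

In this example `P1` is a rank-one projection, `P2` is the identity, `P12` equals the product `P1 P2`, and `integral` reproduces `A`.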

We have now established the existence of the projection-valued measure
$\mu^A$ in Theorem 7.12. The uniqueness of $\mu^A$ is left as an exercise (Exercise 4).
We close this section by proving Proposition 7.16, which states that if a
bounded operator $B$ commutes with a bounded self-adjoint operator $A$,
then $B$ commutes with $f(A)$, for all bounded, Borel-measurable functions
$f$ on $\sigma(A)$.

Proof of Proposition 7.16. If $B$ commutes with $A$, then $B$ commutes
with $p(A)$, for any polynomial $p$. Thus, by taking limits as in the
construction of the continuous functional calculus, $B$ will commute with $f(A)$ for
any continuous real-valued function $f$ on $\sigma(A)$. We now let $\mathcal{F}$ denote the
space of all bounded, Borel-measurable functions $f$ on $\sigma(A)$ for which $f(A)$
commutes with $B$, so that $\mathcal{F}$ contains $C(\sigma(A);\mathbb{R})$.
To show that a bounded measurable $f$ belongs to $\mathcal{F}$, it suffices to show
that for all $\phi, \psi \in \mathbf{H}$ we have $\langle \phi, f(A)B\psi \rangle = \langle \phi, Bf(A)\psi \rangle$, or, equivalently,
$\langle \phi, f(A)B\psi \rangle = \langle B^*\phi, f(A)\psi \rangle$. That is, we want

\[ L_f(\phi, B\psi) = L_f(B^*\phi, \psi). \]

But we have seen that for fixed vectors $\phi_1, \phi_2 \in \mathbf{H}$, the map $f \mapsto L_f(\phi_1, \phi_2)$
is continuous under uniformly bounded pointwise limits. Thus, $\mathcal{F}$ is closed
under such limits, which implies (Exercise 3) that $\mathcal{F}$ contains all bounded,
Borel-measurable functions. ■


8.2 Proof of the Spectral Theorem, Second Version 

We now turn to the proof of Theorem 7.19. As in the proof of Theorem 7.12,
we will make use of continuous functional calculus for a bounded self-adjoint
operator $A$ and the Riesz representation theorem. We begin by establishing
the special case in which $A$ has a cyclic vector, that is, a vector $\psi$ with
the property that the vectors $A^k\psi$, $k = 0, 1, 2, \ldots$, span a dense subspace
of $\mathbf{H}$. In that case, the direct integral will be simply an $L^2$ space (i.e., the
Hilbert spaces $\mathbf{H}_\lambda$ are equal to $\mathbb{C}$ for all $\lambda$). Thus, in this special case, the
direct integral and multiplication operator versions of the spectral theorem
coincide.

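Lemma 8.11 Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and $\psi$ is a cyclic vector
for $A$. Let $\mu_\psi$ be the unique measure on $\sigma(A)$, given by Theorem 8.5, for
which

\[ \langle \psi, f(A)\psi \rangle = \int_{\sigma(A)} f(\lambda) \, d\mu_\psi(\lambda) \tag{8.11} \]

for all $f \in C(\sigma(A);\mathbb{R})$. Then there exists a unitary map

\[ U : \mathbf{H} \to L^2(\sigma(A), \mu_\psi) \]

such that

\[ [UAU^{-1}\phi](\lambda) = \lambda\phi(\lambda) \]

for all $\phi \in L^2(\sigma(A), \mu_\psi)$.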

Proof. We start by defining $U$ on the complex vector space of vectors of
the form $p(A)\psi$, where $p$ is a complex-valued polynomial, as follows:

\[ U[p(A)\psi] = p. \]

To show that $U$ is well defined, write $p$ as $p = p_1 + ip_2$, where $p_1$ and $p_2$
are real-valued polynomials. Since $p_1(A)$ and $p_2(A)$ are self-adjoint and
commuting, we obtain

\[ \langle p(A)\psi, p(A)\psi \rangle = \langle \psi, [p_1(A)^2 + p_2(A)^2]\psi \rangle
   = \int_{\sigma(A)} [p_1(\lambda)^2 + p_2(\lambda)^2] \, d\mu_\psi(\lambda), \tag{8.12} \]

by canceling cross terms and applying (8.11). Thus, if $p(A)\psi = 0$ in $\mathbf{H}$,
then $p(\lambda) = 0$ for $\mu_\psi$-almost every $\lambda$ in $\sigma(A)$, so that $p$ represents the zero
element of $L^2(\sigma(A), \mu_\psi)$.

Equation (8.12) shows also that the map $U$ is isometric on its initial
domain. This initial domain is dense in $\mathbf{H}$ since it contains the vectors
$A^k\psi$ and $\psi$ is cyclic. Thus, the BLT theorem (Theorem A.36) tells us that
$U$ extends uniquely to an isometric map of $\mathbf{H}$ into $L^2(\sigma(A), \mu_\psi)$. Since
polynomials are dense in $L^2(\sigma(A), \mu_\psi)$ (by the Stone-Weierstrass theorem
and Theorem A.10), $U$ actually is unitary.

Now, since $U$ takes $A^k\psi$ to the function $\lambda \mapsto \lambda^k$ in $L^2(\sigma(A), \mu_\psi)$, we
have that $UAU^{-1}(\lambda^k) = \lambda^{k+1}$. Thus,

\[ [UAU^{-1}p](\lambda) = \lambda p(\lambda) \]

for all polynomials $p$. Since polynomials are dense in $L^2(\sigma(A), \mu_\psi)$, we have
$[UAU^{-1}\phi](\lambda) = \lambda\phi(\lambda)$ for all $\phi \in L^2(\sigma(A), \mu_\psi)$, as claimed. ■
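A finite-dimensional instance of Lemma 8.11 may help fix ideas. For a symmetric $2 \times 2$ matrix with distinct eigenvalues $\lambda_j$ and unit eigenvectors $e_j$, a vector $\psi$ is cyclic exactly when every $\langle e_j, \psi \rangle$ is nonzero, and then $U\phi$ is the function on the spectrum with values $\langle e_j, \phi \rangle / \langle e_j, \psi \rangle$. The sketch below checks that this $U$ is isometric into $L^2(\sigma(A), \mu_\psi)$ and turns $A$ into multiplication by $\lambda$; the concrete matrix, vectors, and helper names are assumptions for illustration.

```python
import math

def matvec(X, v):
    return (X[0][0] * v[0] + X[0][1] * v[1], X[1][0] * v[0] + X[1][1] * v[1])

def inner(u, v):
    """Inner product of two real vectors."""
    return u[0] * v[0] + u[1] * v[1]

def sym_eigenpairs(S):
    """(eigenvalue, unit eigenvector) pairs of a real symmetric 2x2 matrix
    whose off-diagonal entry is nonzero."""
    a, b, d = S[0][0], S[0][1], S[1][1]
    mean = (a + d) / 2.0
    disc = math.sqrt(((a - d) / 2.0) ** 2 + b ** 2)
    pairs = []
    for lam in (mean - disc, mean + disc):
        v = (b, lam - a)
        size = math.hypot(v[0], v[1])
        pairs.append((lam, (v[0] / size, v[1] / size)))
    return pairs

A = [[2.0, 1.0], [1.0, -1.0]]
psi = (1.0, 2.0)                                   # candidate cyclic vector
pairs = sym_eigenpairs(A)

psi_coeffs = [inner(e, psi) for _, e in pairs]     # all nonzero, so psi is cyclic
weights = [c * c for c in psi_coeffs]              # mu_psi({lambda_j})

def U(phi):
    """Values of U(phi) at the eigenvalues lambda_1, lambda_2."""
    return [inner(e, phi) / c for (_, e), c in zip(pairs, psi_coeffs)]

def l2_norm_sq(g):
    """Squared norm of g in L^2(sigma(A), mu_psi)."""
    return sum(abs(value) ** 2 * w for value, w in zip(g, weights))

phi = (0.3, -1.7)
image_of_A_phi = U(matvec(A, phi))                           # U(A phi)
lambda_times_U_phi = [lam * value for (lam, _), value in zip(pairs, U(phi))]
```

Since $U[p(A)\psi] = p$, the constant polynomial gives `U(psi) == [1, 1]` up to rounding; `l2_norm_sq(U(phi))` equals $\|\phi\|^2$, and `image_of_A_phi` matches `lambda_times_U_phi`.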

Lemma 8.12 Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and $\mu^A$ is the associated
projection-valued measure on $\sigma(A)$, as in Theorem 8.10. Then there exists
a non-negative real-valued measure $\mu$ on $\sigma(A)$ such that for all Borel sets
$E \subset \sigma(A)$, we have $\mu^A(E) = 0$ if and only if $\mu(E) = 0$.

Proof. Let $\{e_j\}$ be an orthonormal basis for $\mathbf{H}$ and let $\mu_{e_j}$ be the associated
real-valued measures, given by $\mu_{e_j}(E) = \langle e_j, \mu^A(E)e_j \rangle$. Then $\mu_{e_j}(\sigma(A)) =
\langle e_j, Ie_j \rangle = 1$ for all $j$. Thus, the formula

\[ \mu := \sum_j \frac{1}{j^2} \mu_{e_j} \]

defines a finite measure on $\sigma(A)$. Given some Borel set $E \subset \sigma(A)$, if
$\mu^A(E) = 0$, then $\mu_{e_j}(E) = 0$ for all $j$ and so $\mu(E) = 0$. Conversely, if
$\mu(E) = 0$, then

\[ 0 = \langle e_j, \mu^A(E)e_j \rangle = \langle \mu^A(E)e_j, \mu^A(E)e_j \rangle \]

for all $j$, since $\mu^A(E)$ is self-adjoint and $\mu^A(E)^2 = \mu^A(E)$. Thus, $\mu^A(E)e_j =
0$ for all $j$, which means that $\mu^A(E) = 0$. ■

Lemma 8.13 If $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint, then $\mathbf{H}$ can be decomposed as
an orthogonal direct sum of closed nonzero subspaces $W_j$, where each $W_j$ is
invariant under $A$ and where the restriction of $A$ to $W_j$ has a cyclic vector
$\psi_j$. The number of $W_j$'s is either finite or countably infinite.

Proof. Recall our standing assumption that $\mathbf{H}$ is separable, and let $\{\phi_j\}$
be a countable dense subset of $\mathbf{H}$. Let $W_1$ be the closed subspace of $\mathbf{H}$
spanned by $\phi_1, A\phi_1, A^2\phi_1, \ldots$. Then $W_1$ is invariant under $A$ and $\psi_1 := \phi_1$
is a cyclic vector for $A|_{W_1}$. If $W_1 = \mathbf{H}$ then we are done. If not, let $j$ be
the smallest number such that $\phi_j$ is not contained in $W_1$. Let $\psi_2$ be the
orthogonal projection of $\phi_j$ onto the orthogonal complement of $W_1$, and let
$W_2$ be the closed span of $\psi_2, A\psi_2, A^2\psi_2, \ldots$. Then $W_2$ is invariant under $A$
and $\psi_2$ is a cyclic vector for $A|_{W_2}$. Furthermore, since $A$ is self-adjoint and
leaves $W_1$ invariant, it also leaves $W_1^\perp$ invariant, which means that $A^k\psi_2$
is orthogonal to $W_1$ for all $k$, so that $W_2$ is orthogonal to $W_1$.

If, now, $W_1 \oplus W_2 = \mathbf{H}$, we are done. If not, we let $k$ be the smallest
number such that $\phi_k$ is not in $W_1 \oplus W_2$ and we let $\psi_3$ be the projection
of $\phi_k$ onto the orthogonal complement of $W_1 \oplus W_2$, and so on. Continuing
on in this way, we obtain an orthogonal collection of closed subspaces that
are invariant under $A$, each of which has a cyclic vector. Either the process
terminates with finitely many of these subspaces spanning $\mathbf{H}$, or we get an
infinite family. In the latter case, each $\phi_j$ belongs to the span of the $W_j$'s
and hence the (Hilbert space) direct sum of the $W_j$'s is all of $\mathbf{H}$. ■
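The construction in this proof can be carried out literally in finite dimensions, where the countable dense subset may be replaced by any spanning list of vectors. The following is a minimal sketch under that assumption: it uses Gram-Schmidt to build an orthonormal basis of each cyclic subspace $W_j$ from the vectors $\psi_j, A\psi_j, A^2\psi_j, \ldots$. The function names and the sample matrix (which has a repeated eigenvalue, so that no single cyclic vector exists) are illustrative.

```python
import math

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def residual(v, basis):
    """Component of v orthogonal to the span of the orthonormal list `basis`."""
    r = list(v)
    for b in basis:
        c = dot(b, r)
        r = [x - c * y for x, y in zip(r, b)]
    return r

def cyclic_decomposition(A, vectors, tol=1e-10):
    """Return a list of (psi_j, orthonormal basis of W_j), following the proof."""
    n = len(A)
    all_basis = []                       # orthonormal basis of W_1 + ... so far
    pieces = []
    for phi in vectors:
        psi = residual(phi, all_basis)   # project phi onto the complement
        if math.sqrt(dot(psi, psi)) < tol:
            continue                     # phi already lies in the earlier W_j's
        w_basis = []
        v = psi
        while len(w_basis) < n:
            r = residual(v, w_basis)
            size = math.sqrt(dot(r, r))
            if size < tol * max(1.0, math.sqrt(dot(v, v))):
                break                    # the Krylov span has stopped growing
            q = [x / size for x in r]
            w_basis.append(q)
            v = matvec(A, q)             # next direction: A applied to the newest vector
        pieces.append((psi, w_basis))
        all_basis.extend(w_basis)
    return pieces

# Symmetric matrix with eigenvalues 0, 2, 2: the eigenvalue 2 is repeated.
A = [[1.0, 1.0, 0.0],
     [1.0, 1.0, 0.0],
     [0.0, 0.0, 2.0]]
standard_basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

pieces = cyclic_decomposition(A, standard_basis)
dims = [len(w_basis) for _, w_basis in pieces]
```

For this matrix the procedure produces two mutually orthogonal invariant subspaces, of dimensions 2 and 1, whose direct sum is the whole space.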

We are now ready for the proof of our second form of the spectral theorem.

Proof of Theorem 7.19. Let $\{W_j, \psi_j\}$ be as in Lemma 8.13, and let $A_j$
denote the restriction of $A$ to $W_j$, which is a bounded self-adjoint operator
on the Hilbert space $W_j$. For each $A_j$, we can obtain a unitary map $U_j$ as in
Lemma 8.11, and we wish to piece these maps together for different values
of $j$ to obtain a direct integral decomposition for all of $\mathbf{H}$. To facilitate
piecing the maps together, we will modify the $U_j$'s so that they all map to
$L^2$ spaces over a subset of $\sigma(A)$ with respect to the same measure $\mu$.

If we apply Lemma 8.11 to $A_j$, we get a unitary map

\[ U_j : W_j \to L^2(\sigma(A_j), \mu_{\psi_j}) \]

such that $U_j A U_j^{-1}$ is the operator of multiplication by $\lambda$. Here, $\mu_{\psi_j}$ is the
measure on $\sigma(A_j)$ given by $\mu_{\psi_j}(E) = \langle \psi_j, \mu^{A_j}(E)\psi_j \rangle$. Now, according to
Exercise 5, the spectrum of $A_j$ is contained in the spectrum of $A$.
Furthermore, if $E$ is a measurable subset of $\sigma(A_j) \subset \sigma(A)$, then $1_E$ may be
thought of as a measurable function either on $\sigma(A_j)$ or on $\sigma(A)$. Exercise 5
tells us that $1_E(A_j)$, as defined by the functional calculus for $A_j$, coincides
with the restriction to $W_j$ of $1_E(A)$. Thus, if $1_E(A) = 0$ then $1_E(A_j) = 0$
as well. Equivalently, if $\mu^A(E) = 0$ then $\mu^{A_j}(E) = 0$, where $\mu^{A_j}$ is the
projection-valued measure associated to the self-adjoint operator $A_j$.

Let us now choose a measure $\mu$ as in Lemma 8.12. Any set of measure
zero for $\mu$ is a set of measure zero for $\mu^A$ and thus also for $\mu^{A_j}$ and then
for $\mu_{\psi_j}$. Thus, if we extend $\mu_{\psi_j}$ to a measure on $\sigma(A)$ by making it zero on
$\sigma(A) \setminus \sigma(A_j)$, we have that $\mu_{\psi_j}$ is absolutely continuous with respect to $\mu$.
By the Radon-Nikodym theorem (Theorem A.6), each $\mu_{\psi_j}$ has a density
$\rho_j$ with respect to $\mu$, and this density is nonzero $\mu_{\psi_j}$-almost everywhere.
Now, the map

\[ f \mapsto \rho_j^{1/2} f \]

is easily seen to be a unitary map of $L^2(\sigma(A_j), \mu_{\psi_j})$ to $L^2(\sigma(A_j), \mu)$. Thus,
we can define a unitary map

\[ \tilde{U}_j : W_j \to L^2(\sigma(A_j), \mu) \]

by setting

\[ \tilde{U}_j \psi = \rho_j^{1/2} \, U_j \psi. \]

Since multiplication by $(\rho_j)^{1/2}$ commutes with multiplication by $\lambda$, we have

\[ (\tilde{U}_j A \tilde{U}_j^{-1})(\psi)(\lambda) = \lambda\psi(\lambda). \]

Now, $L^2(\sigma(A_j), \mu)$ can be thought of as a direct integral over $\sigma(A)$ with
respect to $\mu$, where we take $\mathbf{H}_\lambda^j = \mathbb{C}$ for $\lambda \in \sigma(A_j)$ and we take $\mathbf{H}_\lambda^j = \{0\}$
if $\lambda \in \sigma(A_j)^c$. We now define another direct integral over $\sigma(A)$ in which
the Hilbert spaces $\mathbf{H}_\lambda$, $\lambda \in \sigma(A)$, are defined by

\[ \mathbf{H}_\lambda = \bigoplus_j \mathbf{H}_\lambda^j. \]

Here the measurable structure on the direct integral is defined by setting

\[ e_j(\lambda) = \begin{cases} (0, 0, \ldots, 1, 0, 0, \ldots), & \lambda \in E_j \\ (0, 0, \ldots, 0, 0, 0, \ldots), & \lambda \in E_j^c \end{cases} \]

where the 1 is in the $j$th slot. Since each $\mathbf{H}_\lambda$ is a direct sum of the $\mathbf{H}_\lambda^j$'s,
the direct integral of the $\mathbf{H}_\lambda$'s is the Hilbert space direct sum of the direct
integral of the $\mathbf{H}_\lambda^j$'s, which is just $L^2(\sigma(A_j), \mu)$.

Meanwhile, $\mathbf{H}$ is the direct sum of the $W_j$'s, and we have unitary maps
$\tilde{U}_j$ of $W_j$ to $L^2(\sigma(A_j), \mu)$ such that $\tilde{U}_j A \tilde{U}_j^{-1}$ is just multiplication by $\lambda$ on
$L^2(E_j, \mu)$. Thus, we can assemble the $\tilde{U}_j$'s into a single unitary map $U$ of $\mathbf{H}$
to the direct integral of the $\mathbf{H}_\lambda$'s, and we will have $UAU^{-1}$ equal to multiplication
by $\lambda$, as desired. ■

In the interest of brevity, we will not give a complete proof of
Proposition 7.22 (uniqueness in Theorem 7.19), but only indicate the main ideas.
To establish the equivalence of $\mu^{(1)}$ and $\mu^{(2)}$, we observe that both
measures have the same sets of measure zero as the projection-valued measure
$\mu^A$ (Proposition 7.23). Meanwhile, if we have two different direct integrals,
each unitarily equivalent to $\mathbf{H}$ as in (7.20), then there will be a unitary
map $V$ between the two direct integrals that commutes with the
operator $s(\lambda) \mapsto \lambda s(\lambda)$. Using an argument similar to that in Exercise 7, we
can show that there must be bounded maps $V_\lambda : \mathbf{H}_\lambda^{(1)} \to \mathbf{H}_\lambda^{(2)}$ such that
$(Vs)(\lambda) = V_\lambda s(\lambda)$ for almost every $\lambda$. Then we argue that the only way
$V$ can be unitary is if $V_\lambda$ is unitary for almost every $\lambda$. This implies that
$\dim \mathbf{H}_\lambda^{(1)} = \dim \mathbf{H}_\lambda^{(2)}$ for almost every $\lambda$.

Finally, we briefly indicate the proof of the multiplication operator form 
of the spectral theorem. 

Proof of Theorem 7.20. Let $W_j$ be as in Lemma 8.13 and let $A_j$ be the
restriction of $A$ to $W_j$. By the proof of Theorem 7.19, each $A_j$ is unitarily
equivalent to multiplication by $\lambda$ on the Hilbert space $L^2(\sigma(A_j), \mu_j)$, for
some finite measure $\mu_j$ on $\sigma(A_j)$. Let $X$ be the disjoint union of the sets
$\sigma(A_j)$, let $\mu$ be the sum of the measures $\mu_j$, and let $h$ be the function
whose restriction to each $\sigma(A_j)$ is the function $\lambda \mapsto \lambda$. Then $L^2(X, \mu)$ is
the orthogonal direct sum of the Hilbert spaces $L^2(\sigma(A_j), \mu_j)$, which means
that $L^2(X, \mu)$ may be identified unitarily with $\mathbf{H} = \oplus W_j$ in an obvious way.
Under this identification, the operator $A$ corresponds to multiplication by $h$.


8.3 Exercises 

1. (a) Suppose $A, B \in \mathcal{B}(\mathbf{H})$ commute and $A$ is not invertible. Show
that $AB$ is not invertible.

Hint: First show that if $AB$ were invertible, then $A$ would have
both a left inverse and a right inverse. Then show that the left
inverse and right inverse would need to be equal.

(b) Show that the result of Part (a) is false if we omit the assumption
that $A$ and $B$ commute.

2. (a) Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and $\sigma(A) \subset [0, \infty)$. Show that
$A$ has a self-adjoint square root in $\mathcal{B}(\mathbf{H})$ and therefore that $A$ is
a non-negative operator (i.e., $\langle \psi, A\psi \rangle \ge 0$ for all $\psi \in \mathbf{H}$).

(b) Give an example of a bounded operator $A$ on a Hilbert space
such that $\sigma(A) \subset [0, \infty)$ but $A$ is not non-negative.

3. Let $X$ be a compact metric space and let $C(X;\mathbb{R})$ denote the space
of continuous real-valued functions on $X$. Suppose that $\mathcal{F}$ is a set of
bounded, measurable, complex-valued functions on $X$ with the
following properties: (1) $\mathcal{F}$ is a complex vector space, (2) $\mathcal{F}$ contains
$C(X;\mathbb{R})$, and (3) $\mathcal{F}$ is closed under pointwise limits of uniformly
bounded sequences. (A sequence $f_n$ is uniformly bounded if there
exists a constant $C$ such that $|f_n(x)| \le C$ for all $n$ and $x$.)

(a) Let $\mathcal{L}_0$ denote the collection of those measurable sets $E$ for which
$1_E$ is a uniformly bounded limit of a sequence of continuous
functions. Show that $\mathcal{L}_0$ is an algebra and contains all open sets
in $X$.

(b) Let $\mathcal{L}_1$ denote the collection of all measurable sets $E$ in $X$ for
which $1_E$ belongs to $\mathcal{F}$. Using the monotone class lemma
(Theorem A.8), show that $\mathcal{L}_1$ consists of all Borel sets in $X$.

(c) Show that $\mathcal{F}$ consists of all bounded, Borel-measurable functions
on $X$.

4. Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and $\mu^A$ and $\nu^A$ are two
projection-valued measures on $\sigma(A)$ such that

\[ \int_{\sigma(A)} \lambda \, d\mu^A(\lambda) = \int_{\sigma(A)} \lambda \, d\nu^A(\lambda) = A. \]

Show that integration with respect to $\mu^A$ agrees with integration with
respect to $\nu^A$, first on polynomials, then on continuous functions, and
finally on bounded measurable functions. Conclude that $\mu^A = \nu^A$.

Hint: Use Exercise 17. 

5. Suppose $A \in \mathcal{B}(\mathbf{H})$ is a self-adjoint operator and $V$ is a closed subspace
of $\mathbf{H}$ that is invariant under $A$.

(a) Using Proposition 7.7, show that the spectrum of the restriction
to $V$ of $A$ is contained in the spectrum of $A$.

(b) Suppose now that $f$ is a bounded measurable function on $\sigma(A)$,
which means that $f$ is also a function on $\sigma(A|_V) \subset \sigma(A)$. Show
that $V$ is invariant under $f(A)$ and that

\[ f(A)|_V = f(A|_V), \]

where the operator on the right-hand side is defined by the
measurable functional calculus for the bounded self-adjoint
operator $A|_V$.

6. Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and $\psi$ is an eigenvector for $A$, that
is, a nonzero vector with $A\psi = \lambda\psi$ for some $\lambda \in \mathbb{R}$. Show that for
any bounded measurable function $f$ on $\sigma(A)$ we have

\[ f(A)\psi = f(\lambda)\psi. \]


Hint: Use Exercise 5. 

7. Suppose $K \subset \mathbb{R}$ is a compact set and $\mu$ is a finite measure on $K$. Let
$A$ be the bounded operator on $L^2(K, \mu)$ given by

\[ (A\psi)(\lambda) = \lambda\psi(\lambda). \]

Now suppose that $B$ is a bounded operator on $L^2(K, \mu)$ that
commutes with $A$.

(a) Let $\phi = B1$, where $1$ denotes the constant function, so that
$\phi \in L^2(K, \mu)$. Show that for all continuous functions $\psi$ on $K$,
we have $B\psi = \phi\psi$.

(b) Using Exercise 3, show that for all bounded, Borel-measurable
functions $\psi$ on $K$, we have $B\psi = \phi\psi$.

(c) Show that $\phi$ is essentially bounded (i.e., bounded outside a set of
$\mu$-measure zero). Conclude that $B\psi = \phi\psi$ for all $\psi \in L^2(K, \mu)$.

8. If $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint, define $U(t) \in \mathcal{B}(\mathbf{H})$ by $U(t) = \exp\{itA\}$
for each $t \in \mathbb{R}$, where the exponential is defined by the functional
calculus for $A$.

(a) Show that $U(t)$ is unitary for all $t$ and that $U(s)U(t) = U(s + t)$.
(A family of operators with this property is called a
one-parameter unitary group.)

(b) Show that the map $t \mapsto U(t)$ is continuous in the operator norm
topology.

(c) Give an example of a one-parameter unitary group on a Hilbert 
space that is not continuous in the operator norm topology. 

See Sect. 10.2 for more on one-parameter unitary groups. 


9 

Unbounded Self-Adjoint Operators 


9.1 Introduction 

Recall that most of the operators of quantum mechanics, including those
representing position, momentum, and energy, are not defined on the
entirety of the relevant Hilbert space, but only on a dense subspace thereof.
In the case of the position operator, for example, given $\psi \in L^2(\mathbb{R})$, the
function $X\psi(x) = x\psi(x)$ could easily fail to be in $L^2(\mathbb{R})$. Nevertheless, the
space of $\psi$'s in $L^2(\mathbb{R})$ for which $x\psi(x)$ is again in $L^2(\mathbb{R})$ is a dense subspace
of $L^2(\mathbb{R})$. A closely related property of these operators is that they are not
bounded, meaning that there is no constant $C$ such that

\[ \|A\psi\| \le C\|\psi\| \]

for all $\psi$ for which $A\psi$ is defined. Because our operators are unbounded, we
cannot use the BLT (bounded linear transformation) theorem to extend
them to the whole Hilbert space.
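Both phenomena can be seen numerically for the position operator. The sketch below is an illustration under stated assumptions: it uses a simple midpoint rule (the helper `integrate` and the test functions are hypothetical choices, not from the text). First, for the indicator function $\psi_n$ of $[n, n+1]$ the ratio $\|X\psi_n\| / \|\psi_n\|$ is at least $n$, so no constant $C$ can work. Second, for $\psi(x) = 1/(1 + |x|)$ the truncated integrals of $|\psi|^2$ stay bounded while those of $|x\psi(x)|^2$ keep growing, so $\psi$ is in $L^2(\mathbb{R})$ but $X\psi$ is not.

```python
import math

def integrate(g, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

def ratio_for_box(n):
    """||X psi_n|| / ||psi_n|| for psi_n = indicator function of [n, n+1]."""
    norm_sq = integrate(lambda x: 1.0, n, n + 1)
    x_norm_sq = integrate(lambda x: x * x, n, n + 1)
    return math.sqrt(x_norm_sq / norm_sq)

# The ratio grows without bound as the box moves to the right.
box_sizes = (1, 10, 100, 1000)
ratios = [ratio_for_box(n) for n in box_sizes]

def psi(x):
    """A square-integrable function whose product with x is not square-integrable."""
    return 1.0 / (1.0 + abs(x))

# Truncated norms over [-R, R]: the first stays below 2, the second grows with R.
truncated = []
for R in (10.0, 100.0, 1000.0):
    norm_sq = integrate(lambda x: psi(x) ** 2, -R, R)
    x_norm_sq = integrate(lambda x: (x * psi(x)) ** 2, -R, R)
    truncated.append((R, norm_sq, x_norm_sq))
```

The numbers only suggest the behavior, of course; the actual statements follow from computing the integrals exactly.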

In this chapter and the following one, we are going to study unbounded
operators defined on dense subspaces of a Hilbert space $\mathbf{H}$. We will
introduce the "correct" notion of self-adjointness for unbounded operators,
namely the one for which the spectral theorem holds. As it turns out, the
obvious candidate for a definition of self-adjointness, namely that $\langle \phi, A\psi \rangle =
\langle A\phi, \psi \rangle$ for all $\phi$ and $\psi$ in the domain of $A$, is not the correct one. Rather,
for any unbounded operator $A$, we will define another unbounded operator
$A^*$, the adjoint of $A$, with its own naturally defined domain. Then $A$ is
said to be self-adjoint if $A^*$ and $A$ are the same operators with the same
domain.

In the present chapter, we give the definition of an unbounded self-adjoint 
operator, along with conditions for self-adjointness and several examples 
and counterexamples. We defer a discussion of the spectral theorem itself 
until Chap. 10. The statement of the spectral theorem (either in terms of 
projection-valued measures or in terms of direct integrals) is essentially the 
same as in the bounded case, with only a few modifications to deal with 
the domain of the operator. 

Although this chapter is rather technical, a reader who is willing to
accept some things on faith may wish simply to read the definitions of
self-adjoint and essentially self-adjoint operators in Sect. 9.2, and then skip to
the statements of Theorem 9.21 and Corollary 9.22 in Sect. 9.5. As in
previous chapters, $\mathbf{H}$ will denote a separable Hilbert space over $\mathbb{C}$.


9.2 Adjoint and Closure of an Unbounded Operator

Recall that we briefly introduced unbounded operators in Sect. 3.2. According
to Definition 3.1, an unbounded operator $A$ on $\mathbf{H}$ is a linear map of some
dense subspace $\mathrm{Dom}(A) \subset \mathbf{H}$ (the domain of $A$) into $\mathbf{H}$. As in Sect. 3.2,
“unbounded” means “not necessarily bounded,” meaning that we permit
the case in which $\mathrm{Dom}(A) = \mathbf{H}$ and $A$ is bounded.

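Now, if $A$ is bounded, then for any $\phi \in \mathbf{H}$, the linear functional

$$\langle\phi, A\,\cdot\,\rangle$$

is bounded. Thus, by the Riesz theorem (Theorem A.52), there is a unique
$\chi$ such that

$$\langle\phi, A\,\cdot\,\rangle = \langle\chi, \cdot\,\rangle.$$

We then define the adjoint $A^*$ of $A$ by setting $A^*\phi$ equal to $\chi$. (See
Sect. A.4.)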

If $A$ is unbounded, then $\langle\phi, A\,\cdot\,\rangle$ is not necessarily bounded, but may be
bounded for certain vectors $\phi$. If $\langle\phi, A\,\cdot\,\rangle$ does happen to be bounded, for
some $\phi \in \mathbf{H}$, then the BLT theorem (Theorem A.36) says that this linear
functional has a unique bounded extension from $\mathrm{Dom}(A)$ to all of $\mathbf{H}$. The
Riesz theorem then tells us that there is a unique $\chi$ such that this linear
functional is “inner product with $\chi$.” This line of reasoning leads to the
following definition, which was already introduced briefly in Sect. 3.2.

Definition 9.1 Suppose $A$ is an operator defined on a dense subspace
$\mathrm{Dom}(A) \subset \mathbf{H}$. Let $\mathrm{Dom}(A^*)$ be the space of all $\phi \in \mathbf{H}$ for which the
linear functional

$$\psi \mapsto \langle\phi, A\psi\rangle, \quad \psi \in \mathrm{Dom}(A),$$

is bounded. For $\phi \in \mathrm{Dom}(A^*)$, define $A^*\phi$ to be the unique vector such that
$\langle\phi, A\psi\rangle = \langle A^*\phi, \psi\rangle$ for all $\psi \in \mathrm{Dom}(A)$.

Saying that $\langle\phi, A\,\cdot\,\rangle$ is bounded means, explicitly, that there exists a constant
$C$ such that $|\langle\phi, A\psi\rangle| \le C\,\|\psi\|$ for all $\psi \in \mathrm{Dom}(A)$. As in the bounded
case, the operator $A^*$ is linear on its domain, and is called the adjoint of $A$.

Another way to think about the definition of $A^*$ is as follows. Given
a vector $\phi$, if there exists a vector $\chi$ such that $\langle\phi, A\psi\rangle = \langle\chi, \psi\rangle$ for all
$\psi \in \mathrm{Dom}(A)$, then $\phi$ belongs to $\mathrm{Dom}(A^*)$ and $A^*\phi = \chi$. By the Riesz
theorem, such a $\chi$ will exist if and only if $\langle\phi, A\,\cdot\,\rangle$ is bounded, which means
this way of thinking about $A^*$ is equivalent to Definition 9.1.

Given a densely defined operator A, the adjoint A* of A could fail to 
be densely defined. This situation, however, is a pathology that does not 
usually occur for operators of interest in applications. 

Definition 9.2 An unbounded operator $A$ on $\mathbf{H}$ is symmetric if

$$\langle\phi, A\psi\rangle = \langle A\phi, \psi\rangle \tag{9.1}$$

for all $\phi, \psi \in \mathrm{Dom}(A)$.

As we will see shortly, if A is symmetric, then A* is an extension of A, 
in the sense of the following definition. 

Definition 9.3 An unbounded operator $A$ is an extension of an unbounded
operator $B$ if $\mathrm{Dom}(A) \supset \mathrm{Dom}(B)$ and $A = B$ on $\mathrm{Dom}(B)$.

If $A$ is an extension of $B$, then very likely $A$ is given by the same “formula”
as $B$. If $\mathbf{H} = L^2(\mathbb{R})$, for example, both operators might be given
by the formula $-i\hbar\, d/dx$ on their respective domains. Nevertheless, if
$\mathrm{Dom}(A) \ne \mathrm{Dom}(B)$, then $A$ is still a different operator from $B$.

Proposition 9.4 An unbounded operator A is symmetric if and only if A* 
is an extension of A. 

Proof. If $A$ is symmetric, then for all $\phi \in \mathrm{Dom}(A)$, (9.1) and the
Cauchy-Schwarz inequality show that

$$|\langle\phi, A\psi\rangle| \le \|A\phi\|\,\|\psi\|$$

for all $\psi \in \mathrm{Dom}(A)$, showing that $\phi \in \mathrm{Dom}(A^*)$. In that case, (9.1) shows
that the unique vector $A^*\phi$ for which $\langle\phi, A\psi\rangle = \langle A^*\phi, \psi\rangle$ is nothing but $A\phi$,
which means that $A^*$ agrees with $A$ on $\mathrm{Dom}(A)$.

In the other direction, if $A^*$ is an extension of $A$, then for each
$\phi \in \mathrm{Dom}(A)$, we have

$$\langle\phi, A\psi\rangle = \langle A^*\phi, \psi\rangle = \langle A\phi, \psi\rangle$$

for all $\psi \in \mathrm{Dom}(A)$, which shows that $A$ is symmetric. ■

We come now to the key definition of this section, that of self-adjointness.
This notion constitutes the hypothesis of the spectral theorem for unbounded
operators.

Definition 9.5 An unbounded operator $A$ on $\mathbf{H}$ is self-adjoint if

$$\mathrm{Dom}(A^*) = \mathrm{Dom}(A)$$

and $A^*\phi = A\phi$ for all $\phi \in \mathrm{Dom}(A)$.

We may reformulate the definition of self-adjointness by saying that A 
is self-adjoint if A* is equal to A, provided that equality of unbounded 
operators is understood to include equality of domains. Every self-adjoint 
operator is symmetric (by Proposition 9.4), but there exist many operators 
that are symmetric without being self-adjoint. In light of Proposition 9.4, 
a symmetric operator is self-adjoint if and only if Dom(A*) = Dom(A). In 
trying to show that a symmetric operator is self-adjoint, the difficulty lies 
in showing that Dom(A*) is no bigger than Dom(A). 

Definition 9.6 An unbounded operator $A$ on $\mathbf{H}$ is said to be closed if the
graph of $A$ is a closed subset of $\mathbf{H} \times \mathbf{H}$. An unbounded operator $A$ on $\mathbf{H}$ is
said to be closable if the closure in $\mathbf{H} \times \mathbf{H}$ of the graph of $A$ is the graph of
a function. If $A$ is closable, then the closure $A^{cl}$ of $A$ is the operator with
graph equal to the closure of the graph of $A$.

To be more explicit, an operator $A$ is closed if and only if the following
condition holds: Suppose a sequence $\psi_n$ belongs to $\mathrm{Dom}(A)$ and suppose
that there exist vectors $\psi$ and $\phi$ in $\mathbf{H}$ with $\psi_n \to \psi$ and $A\psi_n \to \phi$. Then
$\psi$ belongs to $\mathrm{Dom}(A)$ and $A\psi = \phi$. Regarding closability, an operator $A$ is
not closable if there exist two elements in the closure of the graph of $A$ of
the form $(\phi, \psi)$ and $(\phi, \chi)$, with $\psi \ne \chi$. Another way of putting it is to say
that an operator $A$ is closable if there exists some closed extension of it, in
which case the closure of $A$ is the smallest closed extension of $A$.

The notion of the closure of a (closable) operator is useful because it
sweeps away some of the arbitrariness in the choice of a domain of an
operator. If we consider, for example, the operator $A = -i\hbar\, d/dx$ as an
unbounded operator on $L^2(\mathbb{R})$, there are many different reasonable choices
for $\mathrm{Dom}(A)$, including (1) the space of $C^\infty$ functions of compact support,
(2) the Schwartz space (Definition A.15), and (3) the space of continuously
differentiable functions $\psi$ for which both $\psi$ and $\psi'$ belong to $L^2(\mathbb{R})$. As it
turns out, each of these three choices for $\mathrm{Dom}(A)$ leads to the same operator
$A^{cl}$. Note that we are not claiming that every choice for $\mathrm{Dom}(A)$ leads to
the same closure; nevertheless, it is often the case that many reasonable
choices do lead to the same closure.

Definition 9.7 An unbounded operator $A$ on $\mathbf{H}$ is said to be essentially
self-adjoint if $A$ is symmetric and closable and $A^{cl}$ is self-adjoint.

Actually, as we shall see in the next section, a symmetric operator is
always closable. Many symmetric operators fail to be even essentially
self-adjoint. We will see examples of such operators in Sects. 9.6 and 9.10.
Section 9.5 gives some reasonably simple criteria for determining when a
symmetric operator is essentially self-adjoint.


9.3 Elementary Properties of Adjoints and Closed Operators

In this section, we spell out some of the most basic and useful properties 
of adjoints and closures of unbounded operators. In Sect. 9.5, we will draw 
on these results to prove some more substantial results. In what follows, 
if we say that two operators “coincide,” it means that they have the same 
domain and that they are equal on that common domain. 

Proposition 9.8

1. If $A$ is an unbounded operator on $\mathbf{H}$, then the graph of the operator $A^*$
(which may or may not be densely defined) is closed in $\mathbf{H} \times \mathbf{H}$.

2. A symmetric operator is always closable.

Proof. Suppose $\psi_n$ is a sequence in the domain of $A^*$ that converges to
some $\psi \in \mathbf{H}$. Suppose also that $A^*\psi_n$ converges to some $\phi \in \mathbf{H}$. Then
$\langle\psi_n, A\,\cdot\,\rangle = \langle A^*\psi_n, \cdot\,\rangle$ and for any $\chi \in \mathrm{Dom}(A)$, we have

$$\langle\psi, A\chi\rangle = \lim_{n\to\infty} \langle\psi_n, A\chi\rangle = \lim_{n\to\infty} \langle A^*\psi_n, \chi\rangle = \langle\phi, \chi\rangle.$$

This shows that $\psi$ belongs to the domain of $A^*$ and that $A^*\psi = \phi$,
establishing that the graph of $A^*$ is closed.

If $A$ is symmetric, $A^*$ is an extension of $A$. Since, as we have just proved,
$A^*$ is closed, $A$ has a closed extension and is therefore closable. ■

Corollary 9.9 If A is a symmetric operator with Dom(A) = H, then A is 
bounded. 

Proof. Since A is symmetric, it is closable by Proposition 9.8. But since 
the domain of A is already all of H, the closure of A must coincide with 
A itself. (The closure of A always agrees with A on Dom(A), which in this 
case is all of H.) Thus, A is a closed operator defined on all of H, and the 
closed graph theorem (Theorem A.39) implies that A is bounded. ■ 

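Proposition 9.10 If $A$ is a closable operator on $\mathbf{H}$, then the adjoint of
$A^{cl}$ coincides with the adjoint of $A$.

Proof. Suppose that for some $\psi \in \mathbf{H}$ there exists a $\phi$ such that
$\langle\psi, A^{cl}\chi\rangle = \langle\phi, \chi\rangle$ for all $\chi \in \mathrm{Dom}(A^{cl})$. Since $A^{cl}$ is an extension of $A$, it follows
that $\langle\psi, A\chi\rangle = \langle\phi, \chi\rangle$ for all $\chi \in \mathrm{Dom}(A)$. This shows that
$\mathrm{Dom}(A^*) \supset \mathrm{Dom}((A^{cl})^*)$ and that $A^*$ agrees with $(A^{cl})^*$ on $\mathrm{Dom}((A^{cl})^*)$.

In the other direction, suppose for some $\psi \in \mathbf{H}$ there exists a $\phi$ such
that $\langle\psi, A\chi\rangle = \langle\phi, \chi\rangle$ for all $\chi \in \mathrm{Dom}(A)$. Suppose now $\xi \in \mathrm{Dom}(A^{cl})$ with
$A^{cl}\xi = \eta$. Then there exists a sequence $\chi_n$ in $\mathrm{Dom}(A)$ with $\chi_n \to \xi$ and
$A\chi_n \to \eta$, and we have

$$\langle\psi, A\chi_n\rangle = \langle\phi, \chi_n\rangle$$

for all $n$. Letting $n$ tend to infinity, we obtain $\langle\psi, \eta\rangle = \langle\phi, \xi\rangle$, or
$\langle\psi, A^{cl}\xi\rangle = \langle\phi, \xi\rangle$. This shows that $\psi \in \mathrm{Dom}((A^{cl})^*)$ and $(A^{cl})^*\psi = \phi$. Thus,
$\mathrm{Dom}(A^*) \subset \mathrm{Dom}((A^{cl})^*)$. ■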

Proposition 9.11 If $A$ is essentially self-adjoint, then $A^{cl}$ is the unique
self-adjoint extension of $A$.

Proof. Suppose $B$ is a self-adjoint extension of $A$. Since $B = B^*$, $B$ is closed
and is, therefore, an extension of $A^{cl}$. It then follows from the definition of
the adjoint (and from $(A^{cl})^* = A^{cl}$) that $\mathrm{Dom}(B^*) \subset \mathrm{Dom}(A^{cl})$. Thus, we have

$$\mathrm{Dom}(B^*) \subset \mathrm{Dom}(A^{cl}) \subset \mathrm{Dom}(B).$$

Since $B$ is self-adjoint, all three of the above sets must be equal, so actually
$B = A^{cl}$. ■

Proposition 9.12 If $A$ is an unbounded operator on $\mathbf{H}$, then

$$(\mathrm{Range}(A))^{\perp} = \ker(A^*).$$

Proof. First assume that $\psi \in (\mathrm{Range}(A))^{\perp}$. Then for all $\phi \in \mathrm{Dom}(A)$ we
have

$$\langle\psi, A\phi\rangle = 0.$$

That is to say, the linear functional $\langle\psi, A\,\cdot\,\rangle$ is bounded (in fact, zero)
on $\mathrm{Dom}(A)$. Thus, from the definition of the adjoint, we conclude that
$\psi \in \mathrm{Dom}(A^*)$ and $A^*\psi = 0$.

Meanwhile, suppose that $\psi$ is in $\mathrm{Dom}(A^*)$ and that $A^*\psi = 0$. The only
way this can happen is if the linear functional $\langle\psi, A\,\cdot\,\rangle$ is zero on $\mathrm{Dom}(A)$,
which means that $\psi$ is orthogonal to the image of $A$. ■

Proposition 9.13 Suppose $A$ is an unbounded operator on $\mathbf{H}$ and that $B$
is a bounded operator defined on all of $\mathbf{H}$. Let $A + B$ denote the operator
with $\mathrm{Dom}(A + B) = \mathrm{Dom}(A)$ and given by $(A + B)\psi = A\psi + B\psi$ for all
$\psi \in \mathrm{Dom}(A)$. Then $(A + B)^*$ has the same domain as $A^*$ and $(A + B)^*\psi =
A^*\psi + B^*\psi$ for all $\psi \in \mathrm{Dom}(A^*)$.

In particular, the sum of an unbounded self-adjoint operator and a
bounded self-adjoint operator (defined on all of $\mathbf{H}$) is self-adjoint on the
domain of the unbounded operator.

Proof. See Exercise 3. ■

The sum of two unbounded self-adjoint operators is not, in general,
self-adjoint. See Sect. 9.9 for more information about this issue.

Proposition 9.14 Let $A$ be a closed operator and $\lambda$ an element of $\mathbb{C}$.
Suppose that there exists $\varepsilon > 0$ such that

$$\|(A - \lambda I)\psi\| \ge \varepsilon\,\|\psi\| \tag{9.2}$$

for all $\psi$ in $\mathrm{Dom}(A)$. Then the range of $A - \lambda I$ is a closed subspace of $\mathbf{H}$.

Here, we take the domain of the operator $A - \lambda I$ to coincide with the
domain of $A$, as in Proposition 9.13.

Proof. Assume that $\phi_n$ is a sequence in the range of $A - \lambda I$ converging
to some $\phi$. Then $\phi_n = (A - \lambda I)\psi_n$, for some sequence $\psi_n$ in $\mathrm{Dom}(A)$.
Applying (9.2) with $\psi = \psi_n - \psi_m$ shows that $\|\psi_n - \psi_m\| \le (1/\varepsilon)\,\|\phi_n - \phi_m\|$.
This means that $\psi_n$ is Cauchy and thus convergent to some vector $\psi$. Since
$\psi_n \to \psi$ and $(A - \lambda I)\psi_n = \phi_n \to \phi$, we have that

$$A\psi_n = \lambda\psi_n + \phi_n \to \lambda\psi + \phi.$$

Thus, by the definition of a closed operator, $\psi \in \mathrm{Dom}(A)$ and $A\psi = \lambda\psi + \phi$.
This means that $(A - \lambda I)\psi = \phi$ and so the range of $A - \lambda I$ is closed. ■

We conclude this section with a simple example for which we can compute
the adjoint and closure explicitly.

Example 9.15 Let $\{e_j\}$ be an orthonormal basis for $\mathbf{H}$ and let $(\lambda_j)$ be
an arbitrary sequence of real numbers. Define an operator $A$ on $\mathbf{H}$ with
$\mathrm{Dom}(A)$ equal to the space of finite linear combinations of the $e_j$'s, with $A$
itself defined by

$$Ae_j = \lambda_j e_j.$$

Then $A$ is symmetric and closable and $\mathrm{Dom}(A^*) = \mathrm{Dom}(A^{cl}) = V$, where

$$V = \Big\{ \psi = \sum_j a_j e_j \;\Big|\; \sum_j (1 + \lambda_j^2)\,|a_j|^2 < \infty \Big\}. \tag{9.3}$$

For any $\psi = \sum_j a_j e_j$ in $V$ we have

$$A^*\psi = A^{cl}\psi = \sum_j a_j \lambda_j e_j. \tag{9.4}$$

Thus, $(A^{cl})^* = A^* = A^{cl}$, showing that $A$ is essentially self-adjoint.

Proof. Note that for any sequence $(a_j)$ of coefficients satisfying the
condition on the right-hand side of (9.3), we have $\sum_j |a_j|^2 < \infty$ and, thus, the
sum $\sum_j a_j e_j$ converges in $\mathbf{H}$. Suppose first that $\phi = \sum_j a_j e_j$ belongs to $V$.
Then for any $\psi = \sum_j b_j e_j$ (finite sum) in the domain of $A$ we have

$$\langle\phi, A\psi\rangle = \sum_j \overline{a_j}\,\lambda_j b_j$$

and so by the Cauchy-Schwarz inequality,

$$|\langle\phi, A\psi\rangle| \le \Big( \sum_j \lambda_j^2\,|a_j|^2 \Big)^{1/2} \|\psi\|.$$

Thus, $\langle\phi, A\,\cdot\,\rangle$ is a bounded linear functional, showing that $\phi \in \mathrm{Dom}(A^*)$.
Furthermore, it is apparent that $\langle\phi, A\psi\rangle = \langle\chi, \psi\rangle$ for all $\psi \in \mathrm{Dom}(A)$,
where $\chi = \sum_j a_j \lambda_j e_j$.

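Meanwhile, suppose $\phi = \sum_j a_j e_j$ belongs to the domain of $A^*$, and
consider $\psi_N := \sum_{j=1}^{N} \lambda_j a_j e_j$ in $\mathrm{Dom}(A)$. Then

$$|\langle\phi, A\psi_N\rangle| = \sum_{j=1}^{N} \lambda_j^2\,|a_j|^2 = \Big( \sum_{j=1}^{N} \lambda_j^2\,|a_j|^2 \Big)^{1/2} \|\psi_N\|.$$

Since $\phi \in \mathrm{Dom}(A^*)$, the functional $\langle\phi, A\,\cdot\,\rangle$ is bounded, and so
$\sum_{j=1}^{N} \lambda_j^2\,|a_j|^2$ must be bounded, independent of $N$, and so
$\sum_j \lambda_j^2\,|a_j|^2 < \infty$. Since $\phi$ belongs to $\mathbf{H}$, we have also that
$\sum_j |a_j|^2 < \infty$, showing that $\phi$ is in $V$.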

Turning now to the closure of $A$, it is apparent that $A$ is symmetric and
thus closable, by Proposition 9.8. Suppose $\psi = \sum_j a_j e_j$ belongs to $V$ and
consider $\psi_N := \sum_{j=1}^{N} a_j e_j$. Clearly, $\psi_N$ converges to $\psi$. Furthermore, since
$\psi \in V$, we see that $A\psi_N$ converges to the vector $\sum_j a_j \lambda_j e_j$. This shows
that $\psi \in \mathrm{Dom}(A^{cl})$ and that $A^{cl}\psi = \sum_j a_j \lambda_j e_j$. Thus, each element of $V$
belongs to $\mathrm{Dom}(A^{cl})$ and $A^{cl}$ is given on $V$ by (9.4).

Now, the space $V$ forms a Hilbert space with respect to the norm given
by

$$\|\psi\|_V^2 = \sum_j (1 + \lambda_j^2)\,|a_j|^2,$$

where $\psi = \sum_j a_j e_j$. [To establish completeness of $V$ with respect to this
norm, note that $V$ can be identified isometrically with $L^2(\mathbb{N})$ with respect
to the measure $\mu$ for which $\mu(\{j\}) = 1 + \lambda_j^2$.] Suppose, now, that we have a
sequence $(\psi_m)$ in $\mathrm{Dom}(A)$ for which both $(\psi_m)$ and $(A\psi_m)$ are convergent.
Then $(\psi_m)$ forms a Cauchy sequence in $V$, which converges to some element
$\psi$ of $V$. Since $\|\psi\|_{\mathbf{H}} \le \|\psi\|_V$ for all $\psi \in V$, we see that $\psi_m$ also
converges in $\mathbf{H}$ to $\psi \in V$. This shows that each element of $\mathrm{Dom}(A^{cl})$
belongs to $V$. ■


9.4 The Spectrum of an Unbounded Operator
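
As a concrete check of Example 9.15, it may help to see one vector that lies
in $\mathbf{H}$ but not in $V$, and one that lies in $V$ but not in $\mathrm{Dom}(A)$. The sketch
below assumes the particular choice $\lambda_j = j$ (with $j = 1, 2, \ldots$); this choice and
the two test vectors are illustrative and are not taken from the text.

```latex
% Assumption for illustration: \lambda_j = j, j = 1, 2, 3, ...
% (a) A vector in H that is not in V:
\phi = \sum_{j} \tfrac{1}{j}\, e_j, \qquad
\sum_j |a_j|^2 = \sum_j \tfrac{1}{j^2} < \infty, \qquad
\sum_j (1 + \lambda_j^2)\,|a_j|^2 = \sum_j \Big( \tfrac{1}{j^2} + 1 \Big) = \infty .
% (b) A vector in V that is not a finite combination of the e_j's:
\psi = \sum_{j} \tfrac{1}{j^2}\, e_j, \qquad
\sum_j (1 + \lambda_j^2)\,|a_j|^2 = \sum_j \Big( \tfrac{1}{j^4} + \tfrac{1}{j^2} \Big) < \infty, \qquad
A^{cl}\psi = \sum_j \tfrac{1}{j}\, e_j .
% Consequently Dom(A) \subsetneq Dom(A^{cl}) = V \subsetneq H for this choice of \lambda_j.
```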

Recall that if $A$ is a bounded operator, then a number $\lambda \in \mathbb{C}$ belongs to
the resolvent set of $A$ if the operator $A - \lambda I$ has a bounded inverse, and $\lambda$
belongs to the spectrum of $A$ if $A - \lambda I$ does not have a bounded inverse.
For an unbounded operator $A$, we will say that a number $\lambda \in \mathbb{C}$ is in the
resolvent set of $A$ if $A - \lambda I$ has a bounded inverse. That is, even though
$A$ is unbounded, for $\lambda$ to be in the resolvent set of $A$, there must be a
bounded inverse to $A - \lambda I$; otherwise, $\lambda$ is in the spectrum of $A$. We make
this characterization more precise in the following definition.

Definition 9.16 Suppose $A$ is an unbounded operator on $\mathbf{H}$. A number
$\lambda \in \mathbb{C}$ belongs to the resolvent set of $A$ if there exists a bounded operator
$B$ with the following properties: (1) For all $\psi \in \mathbf{H}$, $B\psi$ belongs to $\mathrm{Dom}(A)$
and $(A - \lambda I)B\psi = \psi$, and (2) for all $\psi \in \mathrm{Dom}(A)$ we have $B(A - \lambda I)\psi = \psi$.
If no such bounded operator $B$ exists, then $\lambda$ belongs to the spectrum of $A$.

Note that we are implicitly taking $\mathrm{Dom}(A - \lambda I)$ to equal $\mathrm{Dom}(A)$, as in
Proposition 9.13. As in the bounded case, even if $A$ is self-adjoint, points
$\lambda$ in the spectrum of $A$ are not necessarily eigenvalues; that is, there does
not necessarily exist a nonzero $\psi \in \mathrm{Dom}(A)$ with $A\psi = \lambda\psi$. On the other
hand, if $A\psi = \lambda\psi$ for some nonzero $\psi \in \mathrm{Dom}(A)$, then $A - \lambda I$ is not injective and
thus $\lambda$ certainly does belong to the spectrum of $A$.

Theorem 9.17 If $A$ is an unbounded self-adjoint operator on $\mathbf{H}$, the spectrum
of $A$ is contained in the real line.

If $A$ is symmetric but not self-adjoint, then the spectrum of $A$ must
contain points not in the real line. Indeed, Theorem 9.21 will show that at
least one of $(A - iI)$ and $(A + iI)$ must fail to be surjective, and thus at
least one of the numbers $i$ and $-i$ is in the spectrum of $A$. Nevertheless, a
symmetric operator cannot have nonreal eigenvalues, as we showed already
in Proposition 3.4.

Proof. Consider a complex number $\lambda = a + ib$ with $b \ne 0$. Since $A$ is
symmetric, the proof of Lemma 7.8 applies, giving

$$\langle (A - \lambda I)\psi, (A - \lambda I)\psi\rangle \ge b^2\,\langle\psi, \psi\rangle \tag{9.5}$$

for all $\psi \in \mathrm{Dom}(A)$. This shows that $(A - \lambda I)$ is injective.

Meanwhile, applying Propositions 9.12 and 9.13 with $B = -\lambda I$ we see
that

$$(\mathrm{Range}(A - \lambda I))^{\perp} = \ker((A - \lambda I)^*) = \ker(A^* - \bar{\lambda} I) = \ker(A - \bar{\lambda} I).$$

Since $\bar{\lambda}$ again has nonzero imaginary part, $A - \bar{\lambda} I$ is also injective, showing
that $\mathrm{Range}(A - \lambda I)$ is dense in $\mathbf{H}$. Since $A = A^*$ is closed, (9.5) allows us
to apply Proposition 9.14 to show that $\mathrm{Range}(A - \lambda I)$ is closed, hence all
of $\mathbf{H}$.

We have shown, then, that $(A - \lambda I)$ maps $\mathrm{Dom}(A)$ injectively onto $\mathbf{H}$. It
follows from (9.5) (or the closed graph theorem) that the inverse operator
is bounded, so that $\lambda$ is in the resolvent set of $A$. ■
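
The inequality (9.5) is quoted from the proof of Lemma 7.8, which is not
reproduced in this chapter. The following short derivation is a sketch of how
such an estimate can be obtained using only the symmetry of $A$; it assumes
the inner product is conjugate-linear in its first argument, although the
cross terms cancel under either convention.

```latex
% Write \lambda = a + ib with a, b real, so that A - \lambda I = (A - aI) - ibI.
\|(A - \lambda I)\psi\|^2
  = \|(A - aI)\psi\|^2 + b^2 \|\psi\|^2
    + \langle (A - aI)\psi, -ib\,\psi \rangle + \langle -ib\,\psi, (A - aI)\psi \rangle .
% The two cross terms combine as
\langle (A - aI)\psi, -ib\,\psi \rangle + \langle -ib\,\psi, (A - aI)\psi \rangle
  = ib \big( \langle \psi, (A - aI)\psi \rangle - \langle (A - aI)\psi, \psi \rangle \big) = 0,
% because A - aI is symmetric when a is real. Therefore
\|(A - \lambda I)\psi\|^2 = \|(A - aI)\psi\|^2 + b^2 \|\psi\|^2 \;\ge\; b^2 \|\psi\|^2 .
```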

Our next result shows that the spectrum of an unbounded self-adjoint 
operator has properties similar to that of a bounded self-adjoint operator. 

Proposition 9.18 If $A$ is an unbounded self-adjoint operator on $\mathbf{H}$, then
the following hold.

1. A number $\lambda \in \mathbb{R}$ belongs to the spectrum of $A$ if and only if there
exists a sequence $\psi_n$ of nonzero vectors in $\mathrm{Dom}(A)$ such that

$$\lim_{n\to\infty} \frac{\|(A - \lambda I)\psi_n\|}{\|\psi_n\|} = 0. \tag{9.6}$$

2. The spectrum $\sigma(A)$ of $A$ is a closed subset of $\mathbb{R}$.

Although the spectrum of a bounded self-adjoint operator is a bounded
subset of $\mathbb{R}$, the spectrum of a self-adjoint operator that is not bounded
will be an unbounded subset of $\mathbb{R}$. Indeed, it can be shown (using the
spectral theorem) that if a self-adjoint operator has bounded spectrum,
then the operator must be bounded.

Proof. For Point 1, if a sequence as in (9.6) existed, then as in the proof
of Proposition 7.7, $A - \lambda I$ could not have a bounded inverse, so $\lambda$ must be
in the spectrum of $A$. Conversely, suppose no such sequence exists. Then
there is some $\varepsilon > 0$ such that

$$\|(A - \lambda I)\psi\| \ge \varepsilon\,\|\psi\| \tag{9.7}$$

for all $\psi \in \mathrm{Dom}(A)$. This means that $A - \lambda I$ is injective and that, by
Proposition 9.14, the range of $A - \lambda I$ is closed. But, since $\lambda$ is real,

$$(A - \lambda I)^* = A^* - \lambda I = A - \lambda I$$

and $A - \lambda I$ is injective, so by Proposition 9.12, the range of $A - \lambda I$ is all
of $\mathbf{H}$. This means $A - \lambda I$ has an inverse, which is bounded by (9.7). Thus
$\lambda$ is not in the spectrum of $A$.

Point 2 is left as an exercise (Exercise 4). ■
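
To illustrate Point 1 of Proposition 9.18, the sketch below builds a sequence
satisfying (9.6) for the position operator $X\psi(x) = x\psi(x)$ on $L^2(\mathbb{R})$ and an
arbitrary real $\lambda$. It assumes that $X$ is self-adjoint on the domain of those
$\psi$ with $x\psi(x) \in L^2(\mathbb{R})$, which is not proved in this section; the vectors $\psi_n$
are an illustrative choice.

```latex
% Illustrative choice (an assumption): normalized indicator functions shrinking to \lambda.
\psi_n = \sqrt{n}\,\mathbf{1}_{[\lambda,\ \lambda + 1/n]}, \qquad
\|\psi_n\|^2 = n \cdot \tfrac{1}{n} = 1,
\qquad
\|(X - \lambda I)\psi_n\|^2 = n \int_{\lambda}^{\lambda + 1/n} (x - \lambda)^2 \, dx
  = n \cdot \frac{1}{3 n^3} = \frac{1}{3 n^2} \;\longrightarrow\; 0 .
% So (9.6) holds and \lambda lies in the spectrum of X. On the other hand,
% (x - \lambda)\psi(x) = 0 almost everywhere forces \psi = 0 in L^2(\mathbb{R}),
% so \lambda is not an eigenvalue.
```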

Definition 9.19 Let $A$ be an unbounded operator on $\mathbf{H}$. Then $A$ is
non-negative if $\langle\psi, A\psi\rangle \ge 0$ for all $\psi \in \mathrm{Dom}(A)$, and $A$ is bounded below by
$c \in \mathbb{R}$ if $\langle\psi, A\psi\rangle \ge c\,\|\psi\|^2$ for all $\psi \in \mathrm{Dom}(A)$.

Proposition 9.20 Let $A$ be an unbounded self-adjoint operator on $\mathbf{H}$. If
$A$ is non-negative, then the spectrum of $A$ is contained in $[0, \infty)$. More
generally, if $A$ is bounded below by $c$, then the spectrum of $A$ is contained
in $[c, \infty)$.

We will eventually see, using the spectral theorem for unbounded
self-adjoint operators, that the converse to Proposition 9.20 also holds: If the
spectrum of a self-adjoint operator $A$ is contained in $[0, \infty)$, then $A$ is
non-negative, and if the spectrum of $A$ is contained in $[c, \infty)$, then $A$ is bounded
below by $c$. These results follow easily, for example, from the form of the
spectral theorem in Theorem 10.9.

Proof. Suppose $A$ is bounded below by $c$ and $\lambda$ is a point in the spectrum
of $A$. Let $\psi_n$ be a sequence as in Point 1 of Proposition 9.18, with the $\psi_n$'s
normalized to be unit vectors. Then

$$\lim_{n\to\infty} |\langle\psi_n, (A - \lambda I)\psi_n\rangle| \le \lim_{n\to\infty} \|(A - \lambda I)\psi_n\| = 0.$$

On the other hand, $A = \lambda I + (A - \lambda I)$, and so

$$\langle\psi_n, A\psi_n\rangle = \lambda + \langle\psi_n, (A - \lambda I)\psi_n\rangle.$$

Thus, $\langle\psi_n, A\psi_n\rangle$ converges to $\lambda$ ($= \lambda\,\langle\psi_n, \psi_n\rangle$) as $n$ tends to infinity. Since
$A$ is bounded below by $c$, we must have $\lambda \ge c$. This establishes the result
for operators bounded below by $c$. Specializing to $c = 0$ gives the result for
non-negative operators. ■


9.5 Conditions for Self-Adjointness and Essential Self-Adjointness

In this section, we give criteria for determining whether a symmetric operator
is self-adjoint or essentially self-adjoint. See also Sect. 10.2 for the
connection between self-adjoint operators and one-parameter unitary groups.

Theorem 9.21 If $A$ is a symmetric operator on $\mathbf{H}$, then $A$ is essentially
self-adjoint if and only if $\mathrm{Range}(A - iI)$ and $\mathrm{Range}(A + iI)$ are dense
subspaces of $\mathbf{H}$.

Using Proposition 9.12, we can reformulate this result as follows. 

Corollary 9.22 If $A$ is a symmetric operator on $\mathbf{H}$, then $A$ is essentially
self-adjoint if and only if the operators $A^* + iI$ and $A^* - iI$ are injective
on $\mathrm{Dom}(A^*)$.

As Exercise 11 shows, it is possible to have one of the operators $A^* + iI$
and $A^* - iI$ be injective and the other fail to be injective.

Proof of Theorem 9.21. Assume first that $A$ is essentially self-adjoint,
so that $A^{cl}$ is self-adjoint. Then $A^* = (A^{cl})^* = A^{cl}$, and so

$$[\mathrm{Range}(A - iI)]^{\perp} = \ker(A^* + iI) = \ker(A^{cl} + iI) = \{0\},$$

by Theorem 9.17, and similarly for the range of $A + iI$.

Conversely, assume $A$ is symmetric and that $A - iI$ and $A + iI$ both
have dense range. Since $(A^{cl})^* = A^*$ is a closed extension of $A$, it is also
an extension of $A^{cl}$, showing that $A^{cl}$ is symmetric. We may then apply
Lemma 7.8 (the proof of which requires only symmetry) to the operator
$A^{cl}$ with $\lambda = i$, giving

$$\|(A^{cl} - iI)\psi\|^2 \ge \|\psi\|^2 \tag{9.8}$$

and showing that $A^{cl} - iI$ is injective. Since the range of $A - iI$ is dense,
the range of $A^{cl} - iI$ is certainly also dense. But since $A^{cl}$ is closed, (9.8)
and Proposition 9.14 tell us that the range of $A^{cl} - iI$ is closed, hence all
of $\mathbf{H}$. Similar reasoning shows that the range of $A^{cl} + iI$ is also all of $\mathbf{H}$.

Now, by Proposition 9.13, $(A^{cl} - iI)^* = (A^{cl})^* + iI$, which is an extension
of $A^{cl} + iI$. Suppose $(A^{cl})^* + iI$ is a proper extension of $A^{cl} + iI$, that is,
that the domain of $(A^{cl})^* + iI$ is strictly bigger than the domain of $A^{cl} + iI$.
Then since $A^{cl} + iI$ already maps onto $\mathbf{H}$, $(A^{cl})^* + iI$ cannot be injective.
Thus, the operator

$$(A^{cl})^* + iI = A^* + iI = (A - iI)^*$$

must have a nontrivial kernel. Then by Proposition 9.12, $\mathrm{Range}(A - iI)$ is
not dense, contradicting our assumptions.

We conclude, therefore, that $(A^{cl})^* + iI$ is not a proper extension of
$A^{cl} + iI$, i.e., that $(A^{cl})^* + iI = A^{cl} + iI$ (with equality of domains). This,
by Proposition 9.13, means that $(A^{cl})^* = A^{cl}$ (with equality of domains),
which is what we are trying to prove. ■

Proposition 9.23 If $A$ is a symmetric operator on $\mathbf{H}$, then $A$ is
self-adjoint if and only if

$$\mathrm{Range}(A - iI) = \mathrm{Range}(A + iI) = \mathbf{H}.$$

Proof. Suppose first that $A$ is self-adjoint. Then by Theorem 9.21, the
ranges of $A - iI$ and $A + iI$ are dense in $\mathbf{H}$. On the other hand,

$$\|(A - iI)\psi\|^2 \ge \|\psi\|^2, \tag{9.9}$$

by (the proof of) Lemma 7.8, with $\lambda = i$. Since, also, $A = A^*$ is closed,
Proposition 9.14 tells us that the range of $A - iI$ is closed, hence all of $\mathbf{H}$.
A similar argument shows that the range of $A + iI$ is all of $\mathbf{H}$.

Conversely, suppose that the ranges of $A - iI$ and $A + iI$ are all of $\mathbf{H}$.
Then $A$ is essentially self-adjoint by Theorem 9.21, so that $A^* = A^{cl}$ is self-adjoint.
Since $A - iI$ already maps onto $\mathbf{H}$, if $A^*$ were a nontrivial extension of $A$,
then $A^* - iI$ could not be injective. But (9.9), with $A$ replaced by $A^*$, shows
that $A^* - iI$ is injective. Thus, $A = A^*$ and so $A$ is self-adjoint. ■

In the case that $A$ is positive-semidefinite (i.e., $\langle\psi, A\psi\rangle \ge 0$ for all
$\psi \in \mathrm{Dom}(A)$), there is another self-adjointness condition, the proof of which is
very similar to that of Theorem 9.21.

Theorem 9.24 Suppose that $A$ is a symmetric operator on $\mathbf{H}$ and that
$\langle\psi, A\psi\rangle \ge 0$ for all $\psi \in \mathrm{Dom}(A)$. Then $A$ is essentially self-adjoint if and
only if $A + I$ has dense range. Equivalently, $A$ is essentially self-adjoint if
and only if $A^* + I$ is injective.

Proof. Assume first that $A$ is essentially self-adjoint. Then $(A + I)^* =
A^* + I = A^{cl} + I$. It is easily seen that $A^{cl}$ is also positive-semidefinite, and so

$$\langle\psi, (A^{cl} + I)\psi\rangle = \langle\psi, \psi\rangle + \langle\psi, A^{cl}\psi\rangle \ge \langle\psi, \psi\rangle. \tag{9.10}$$

Thus, $A^{cl} + I = (A + I)^*$ is injective, and so the range of $A + I$ is dense, by
Proposition 9.12.

Now assume that $A + I$ has dense range. By (9.10), $A^{cl} + I$ is injective and
by (9.10) and Proposition 9.14, the range of $A^{cl} + I$ is closed, hence all of $\mathbf{H}$.
Assume $\mathrm{Dom}(A^*)$ is strictly larger than $\mathrm{Dom}(A^{cl})$. Then because $A^{cl} + I$ is
already surjective, $A^* + I$ (which has a domain equal to the domain of $A^*$)
cannot be injective. Thus, $A^* + I = (A + I)^*$ has a nontrivial kernel, which
means that the range of $A + I$ is not dense. This is a contradiction, and
so the domain of $A^*$ must actually be equal to the domain of $A^{cl}$. Since $A$
and so also $A^{cl}$ are symmetric, this means that $A^{cl}$ is self-adjoint. ■

Example 9.25 Suppose that $A$ is a symmetric operator on $\mathbf{H}$ that has
an orthonormal basis of eigenvectors. That is to say, suppose there is an
orthonormal basis $\{e_j\}$ for $\mathbf{H}$ such that for each $j$, we have $e_j \in \mathrm{Dom}(A)$
and $Ae_j = \lambda_j e_j$ for some real number $\lambda_j$. Then $A$ is essentially self-adjoint.

This result is a strengthening of Example 9.15, in that we do not assume
that the domain of $A$ is equal to the space of finite linear combinations of
the $e_j$'s.

Proof. For any $j$, $(A - iI)e_j = (\lambda_j - i)e_j$. Since $\lambda_j$ is real, we have a
nonzero multiple of $e_j$ belonging to $\mathrm{Range}(A - iI)$, for each $j$. This shows
that $\mathrm{Range}(A - iI)$ is dense, and similarly for $\mathrm{Range}(A + iI)$. ■

Example 9.26 Suppose $\mathbf{H}$ is a Hilbert space direct sum of a sequence of
separable Hilbert spaces $\mathbf{H}_j$:

$$\mathbf{H} = \bigoplus_{j=1}^{\infty} \mathbf{H}_j.$$

Suppose also that $A_j$ is a bounded self-adjoint operator on $\mathbf{H}_j$, for each $j$.
Define a subspace $V$ of $\mathbf{H}$ by

$$V = \Big\{ \psi = (\psi_1, \psi_2, \ldots) \;\Big|\; \sum_{j=1}^{\infty} \big( \|\psi_j\|^2 + \|A_j\psi_j\|^2 \big) < \infty \Big\}.$$

Suppose now that $A$ is a symmetric operator on $\mathbf{H}$ whose domain contains
the finite direct sum of the $\mathbf{H}_j$'s and such that $A|_{\mathbf{H}_j} = A_j$. Then $A$ is
essentially self-adjoint, $\mathrm{Dom}(A^{cl}) = \mathrm{Dom}(A^*) = V$, and

$$A^{cl}\psi = A^*\psi = (A_1\psi_1, A_2\psi_2, \ldots) \tag{9.11}$$

for all $\psi = (\psi_1, \psi_2, \ldots)$ in $V$.

See Definition A.45 for the definition of the Hilbert direct sum and the
finite direct sum of a sequence of Hilbert spaces. Example 9.25 is the special
case of Example 9.26 in which each $\mathbf{H}_j$ has dimension 1. This result will
be useful to us in Chap. 10.

Proof. Since $A_j$ is self-adjoint, the ranges of $A_j - iI$ and $A_j + iI$ are
dense in $\mathbf{H}_j$. Thus, the closure of the range of $A - iI$ contains each $\mathbf{H}_j$
and is therefore dense in $\mathbf{H}$, and similarly for $A + iI$. This shows that $A$ is
essentially self-adjoint.

It remains to show that the domain of $A^* = A^{cl}$ is $V$. Let $W$ denote the
finite direct sum of the $\mathbf{H}_j$'s. By the argument in the previous paragraph,
$A|_W$ is essentially self-adjoint. Then $A^*$ is a symmetric extension of $(A|_W)^*$.
Since $(A|_W)^*$ is self-adjoint, it has no proper symmetric extension, and so $A^*$
must coincide with $(A|_W)^*$. Thus, it suffices to consider the case
$\mathrm{Dom}(A) = W$.

If we assume that $\mathrm{Dom}(A) = W$, we can compute the adjoint of $A$ by the
argument in Example 9.15. If $\phi \in V$, then the Cauchy-Schwarz inequality
shows that the linear functional $\langle\phi, A\,\cdot\,\rangle$ is bounded and that $A^*\phi$ is as in
(9.11). On the other hand, if $\langle\phi, A\,\cdot\,\rangle$ is bounded, where $\phi = (\phi_1, \phi_2, \ldots)$,
take

$$\psi_N = (A_1\phi_1, A_2\phi_2, \ldots, A_N\phi_N, 0, 0, \ldots).$$

Then, as in the proof of Example 9.15, the only way we can have
$|\langle\phi, A\psi_N\rangle| \le C\,\|\psi_N\|$ for all $N$ is if $\phi$ belongs to $V$. ■


9.6 A Counterexample 

In this section, we will examine an elementary example of an operator that
is symmetric but not essentially self-adjoint. Our example will be essentially
the momentum operator on a finite interval, with “wrong” boundary
conditions. (A more sophisticated example is given in Sect. 9.10.) We take
our Hilbert space to be $L^2([0,1])$.

Proposition 9.27 Let $\mathrm{Dom}(A) \subset L^2([0,1])$ be the space of continuously
differentiable functions $\psi$ on $[0,1]$ satisfying

$$\psi(0) = \psi(1) = 0.$$

For $\psi \in \mathrm{Dom}(A)$, define

$$A\psi = -i\,\frac{d\psi}{dx}.$$

Then $A$ is symmetric but not essentially self-adjoint.

We can understand the failure of essential self-adjointness of $A$ in practical
terms as a failure of the spectral theorem. The eigenvector equation
$A\psi = \lambda\psi$ for $\lambda \in \mathbb{R}$ is a first-order ordinary differential equation, whose
general solution is $\psi(x) = ce^{i\lambda x}$, where $c$ is a constant. The only way such a
function can satisfy the boundary conditions $\psi(0) = \psi(1) = 0$ is if $c = 0$, in
which case $\psi$ is the zero vector. Thus, $A$ has no eigenvectors. Furthermore,
taking the closure of $A$ does not help, because, as the proof will show, the
boundary conditions survive taking the closure.

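Proof of symmetry. Using integration by parts we see that for all $\phi$ and
$\psi$ in $\mathrm{Dom}(A)$ we have

$$\langle\phi, A\psi\rangle = -i\int_0^1 \overline{\phi(x)}\,\frac{d\psi}{dx}\,dx
= -i\,\overline{\phi(x)}\,\psi(x)\Big|_0^1 + i\int_0^1 \overline{\frac{d\phi}{dx}}\,\psi(x)\,dx. \tag{9.12}$$

Since we assume $\phi$ and $\psi$ are in $\mathrm{Dom}(A)$, the boundary terms are zero and
we get

$$\langle\phi, A\psi\rangle = i\int_0^1 \overline{\frac{d\phi}{dx}}\,\psi(x)\,dx.$$

Because there is a conjugate in one side of the inner product but not the
other, it follows that

$$\langle\phi, A\psi\rangle = \Big\langle -i\,\frac{d\phi}{dx}, \psi \Big\rangle = \langle A\phi, \psi\rangle,$$

as claimed. ■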

We now consider $A^{cl}$ and $A^* = (A^{cl})^*$. We will see that there are elements
of the domain of the adjoint that are not in the domain of the closure.

Lemma 9.28 If $\phi$ is a continuously differentiable function on $[0,1]$, then
$\phi \in \mathrm{Dom}(A^*)$ and $A^*\phi = -i\, d\phi/dx$.

Proof. If $\phi$ is continuously differentiable, then for any $\psi$ in $\mathrm{Dom}(A)$, we
may integrate by parts as in (9.12). Since $\psi$ is zero at both ends of the
interval, the boundary terms vanish and we obtain

$$\langle\phi, A\psi\rangle = i\int_0^1 \overline{\frac{d\phi}{dx}}\,\psi(x)\,dx = \Big\langle -i\,\frac{d\phi}{dx}, \psi \Big\rangle. \tag{9.13}$$

Since $d\phi/dx$ is continuous and hence in $L^2([0,1])$, we see that (9.13) is a
continuous linear functional, as a function of $\psi$ with fixed $\phi$. Thus, $\phi$ is in
the domain of $A^*$, and $A^*\phi = -i\, d\phi/dx$. ■

Proof of Proposition 9.27. Suppose $\psi$ is in the domain of $A^{cl}$. Then
there exist $\psi_n$ in $\mathrm{Dom}(A)$ such that $\psi_n$ converges to $\psi$ and $A\psi_n$ converges
to some $\chi \in L^2([0,1])$. Since the derivatives of the $\psi_n$'s are converging in
$L^2$, the $\psi_n$'s themselves must be converging uniformly, as can be shown by
writing each $\psi_n$ as the integral of its derivative. (See Exercise 10.) It follows
that every element of $\mathrm{Dom}(A^{cl})$ is continuous and vanishes at both ends of
the interval. On the other hand, $\mathrm{Dom}(A^*)$ contains all smooth functions,
including many that do not vanish at the ends of the interval. Thus, $A^{cl}$
and $(A^{cl})^* = A^*$ do not have the same domains. ■
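
The same conclusion can be cross-checked against Corollary 9.22. The sketch
below takes $A^*$ to act as $-i\, d/dx$ on continuously differentiable functions, as in
the proof of Lemma 9.28, and exhibits explicit nonzero vectors in the kernels
of $A^* - iI$ and $A^* + iI$. The functions $\phi_{\pm}$ are introduced here for illustration
and are not named in the text.

```latex
% Illustrative vectors (an assumption): \phi_+(x) = e^{-x}, \phi_-(x) = e^{x} on [0,1].
% Both are continuously differentiable, hence in Dom(A^*) by Lemma 9.28.
A^*\phi_+ = -i\,\frac{d}{dx} e^{-x} = i\,e^{-x} = i\,\phi_+
  \quad\Longrightarrow\quad (A^* - iI)\phi_+ = 0,
\qquad
A^*\phi_- = -i\,\frac{d}{dx} e^{x} = -i\,e^{x} = -i\,\phi_-
  \quad\Longrightarrow\quad (A^* + iI)\phi_- = 0 .
% Neither A^* - iI nor A^* + iI is injective, so by Corollary 9.22
% the operator A is not essentially self-adjoint.
```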

It follows from Lemma 9.28 that every complex number $\lambda$ belongs to the
spectrum of $A^{cl}$. See Exercise 9.

The reason that $A$ fails to be essentially self-adjoint is that we impose too
many boundary conditions on functions in the domain of $A$, which results
in there being too few boundary conditions (in this case, no boundary
conditions at all) on functions in the domain of $A^*$. In this example, $A^*$ is
given by the same formula as $A$ ($-i\, d/dx$ in both cases), but the domain of
$A^*$ is bigger than the domain of $A^{cl}$.

Suppose we define another operator $B$, still given by the formula $-i\,d/dx$,
but with the domain of $B$ taken to be the space of continuously
differentiable functions $\psi$ with $\psi(0) = \psi(1)$. If we integrate by
parts as in (9.12), the boundary terms will cancel, showing that $B$ is
symmetric. Meanwhile, the functions $\psi_n(x) := e^{2\pi i n x}$,
$n \in \mathbb{Z}$, form an orthonormal basis for $L^2([0,1])$ consisting of
eigenvectors for $B$, with real eigenvalues $\lambda_n = 2\pi n$. Thus, by
Example 9.25, $B$ is essentially self-adjoint.
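
The two claims about the operator $B$ with periodic boundary conditions can be checked by hand. The following is a sketch, assuming the inner product on $L^2([0,1])$ is conjugate-linear in the first factor; $\phi$ and $\psi$ denote arbitrary elements of the domain of $B$.

```latex
% Symmetry of B: the condition psi(0) = psi(1) makes the boundary term cancel.
\begin{align*}
\langle \phi, B\psi\rangle - \langle B\phi, \psi\rangle
  &= -i\int_0^1 \Bigl(\overline{\phi(x)}\,\psi'(x) + \overline{\phi'(x)}\,\psi(x)\Bigr)\,dx \\
  &= -i\Bigl[\,\overline{\phi(x)}\,\psi(x)\Bigr]_0^1
   = -i\Bigl(\overline{\phi(1)}\,\psi(1) - \overline{\phi(0)}\,\psi(0)\Bigr) = 0.
\end{align*}
% Eigenvectors: each psi_n lies in the domain of B and has a real eigenvalue.
\[
B\psi_n = -i\,\frac{d}{dx}\,e^{2\pi i n x} = 2\pi n\,e^{2\pi i n x} = 2\pi n\,\psi_n,
\qquad \psi_n(1) = e^{2\pi i n} = 1 = \psi_n(0).
\]
```

By contrast, for the operator $A$ of this section the boundary term vanishes only because $\psi$ itself vanishes at both ends, which is why no condition at all is forced on $\phi$.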


9.7 An Example 

We now give an example of an operator that is essentially self-adjoint. Let
$C_c^\infty(\mathbb{R})$ denote the space of smooth, compactly supported
functions on $\mathbb{R}$.

Proposition 9.29 Let $P$ be the densely defined operator with
$\mathrm{Dom}(P) = C_c^\infty(\mathbb{R}) \subset L^2(\mathbb{R})$ and given by
$P\psi = -i\hbar\,d\psi/dx$. Then $P$ is essentially self-adjoint.

Proof. Our strategy is to apply Corollary 9.22. Since $P$ is symmetric, we
expect that $P^*$ will be given by the formula $-i\hbar\,d/dx$, on some
suitable domain inside $L^2(\mathbb{R})$. Thus, if $\psi \in \ker(P^* + iI)$,
this should mean that $-i\hbar\,d\psi/dx = -i\psi$, or
$d\psi/dx = (1/\hbar)\psi(x)$, which ought to imply that
$\psi(x) = ce^{x/\hbar}$, for some constant $c$. Since $ce^{x/\hbar}$ belongs
to $L^2(\mathbb{R})$ only if $c = 0$, we hope to conclude that $\psi = 0$.

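To say that $\psi \in L^2(\mathbb{R})$ belongs to the kernel of $P^* + iI$
means that $\psi$ belongs to $\mathrm{Dom}(P^*)$ and that $P^*\psi = -i\psi$.
This holds if and only if

$$-i\hbar \int_{\mathbb{R}} \overline{\psi(x)}\,\frac{d\chi}{dx}\,dx = i \int_{\mathbb{R}} \overline{\psi(x)}\,\chi(x)\,dx$$

for all $\chi \in C_c^\infty(\mathbb{R})$. For any
$f \in C_c^\infty(\mathbb{R})$, if we take $\chi(x) = f(x)e^{-x/\hbar}$ and
combine the integrals into one, we get

$$\int_{\mathbb{R}} e^{-x/\hbar}\,\frac{df}{dx}\,\overline{\psi(x)}\,dx = 0. \tag{9.14}$$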


Now, (9.14) says that the derivative of $e^{-x/\hbar}\psi(x)$ in the weak or
distributional sense is zero. (See Proposition A.29 in Appendix A.3.3.) Thus,
by the remarks immediately following Proposition A.5, we must have
$e^{-x/\hbar}\psi(x) = c$ for some $c$, meaning that $\psi(x) = ce^{x/\hbar}$.
Since we also assume that $\psi$ belongs to
$\mathrm{Dom}(P^*) \subset L^2(\mathbb{R})$, we must have $c = 0$, so that
$\psi$ is the zero element of $L^2(\mathbb{R})$.

We have shown, then, that only 0 belongs to the kernel of $P^* + iI$. A
similar argument with $i$ replaced by $-i$ and $e^{x/\hbar}$ by
$e^{-x/\hbar}$ shows that only 0 belongs to the kernel of $P^* - iI$. Thus,
by Corollary 9.22, $P$ is essentially self-adjoint. ■
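
The substitution that leads to (9.14) can be spelled out as follows. This is a sketch, assuming that the condition $P^*\psi = -i\psi$ is written as an identity of integrals against test functions $\chi$, with the inner product conjugate-linear in the first factor; the intermediate displays are an illustration and not necessarily the exact form used in the text.

```latex
% Step 1: the adjoint condition, tested against chi in C_c^infty(R).
\begin{align*}
\langle \psi, P\chi\rangle = \langle -i\psi, \chi\rangle
 \;&\Longleftrightarrow\;
 -i\hbar\int_{\mathbb{R}} \overline{\psi}\,\chi'\,dx = i\int_{\mathbb{R}} \overline{\psi}\,\chi\,dx
 \;\Longleftrightarrow\;
 \int_{\mathbb{R}} \overline{\psi(x)}\,\bigl(\hbar\chi'(x) + \chi(x)\bigr)\,dx = 0.
\end{align*}
% Step 2: the choice chi = f e^{-x/hbar} kills the zeroth-order term.
\begin{align*}
\chi(x) = f(x)\,e^{-x/\hbar}
 \;&\Longrightarrow\;
 \hbar\chi'(x) + \chi(x) = \hbar f'(x)\,e^{-x/\hbar} \\
 \;&\Longrightarrow\;
 \int_{\mathbb{R}} \overline{\psi(x)}\,e^{-x/\hbar}\,f'(x)\,dx = 0
 \quad\text{for all } f \in C_c^\infty(\mathbb{R}).
\end{align*}
```

Note that $\chi = fe^{-x/\hbar}$ is again smooth and compactly supported, so it is a legitimate element of $\mathrm{Dom}(P)$.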


9.8 The Basic Operators of Quantum Mechanics 

In this section, we consider several of the unbounded self-adjoint operators
that arise in quantum mechanics. We find natural domains of
self-adjointness for the position, momentum, kinetic energy, and potential
energy operators. Since Schrödinger operators are more complicated to analyze,
we postpone a discussion of them until the next section. We begin with the
potential energy operator.

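Proposition 9.30 Suppose $V : \mathbb{R}^n \to \mathbb{R}$ is a measurable
function. Let $V(\mathbf{X})$ be the unbounded operator with domain

$$\mathrm{Dom}(V(\mathbf{X})) = \{\psi \in L^2(\mathbb{R}^n) \mid V(\mathbf{x})\psi(\mathbf{x}) \in L^2(\mathbb{R}^n)\}$$

and given by

$$[V(\mathbf{X})\psi](\mathbf{x}) = V(\mathbf{x})\psi(\mathbf{x}).$$

Then $\mathrm{Dom}(V(\mathbf{X}))$ is dense in $L^2(\mathbb{R}^n)$ and
$V(\mathbf{X})$ is self-adjoint on this domain.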

Proof. Define a subset $E_m$ of $\mathbb{R}^n$ by

$$E_m = \{\mathbf{x} \in \mathbb{R}^n \mid |V(\mathbf{x})| < m\},$$

so that $\bigcup_m E_m = \mathbb{R}^n$. Then for any
$\psi \in L^2(\mathbb{R}^n)$, the function $\psi 1_{E_m}$ belongs to
$\mathrm{Dom}(V(\mathbf{X}))$. On the other hand, using dominated convergence,
we have $\psi 1_{E_m} \to \psi$ as $m \to \infty$, establishing that
$\mathrm{Dom}(V(\mathbf{X}))$ is dense.

Since $V$ is real-valued, it is easy to see that $V(\mathbf{X})$ is symmetric
on $\mathrm{Dom}(V(\mathbf{X}))$. Thus, $V(\mathbf{X})^*$ is an extension of
$V(\mathbf{X})$.

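Meanwhile, suppose $\psi \in \mathrm{Dom}(V(\mathbf{X})^*)$, meaning that

$$\phi \mapsto \int_{\mathbb{R}^n} \overline{\psi(\mathbf{x})}\,V(\mathbf{x})\,\phi(\mathbf{x})\,d\mathbf{x}, \quad \phi \in \mathrm{Dom}(V(\mathbf{X})) \tag{9.15}$$

is a bounded linear functional. This linear functional has a unique bounded
extension to $L^2(\mathbb{R}^n)$ and, thus, there exists a unique
$\chi \in L^2(\mathbb{R}^n)$ such that

$$\int_{\mathbb{R}^n} \overline{\psi(\mathbf{x})}\,V(\mathbf{x})\,\phi(\mathbf{x})\,d\mathbf{x} = \int_{\mathbb{R}^n} \overline{\chi(\mathbf{x})}\,\phi(\mathbf{x})\,d\mathbf{x}, \tag{9.16}$$

or

$$\int_{\mathbb{R}^n} \left[\overline{\psi(\mathbf{x})}\,V(\mathbf{x}) - \overline{\chi(\mathbf{x})}\right]\phi(\mathbf{x})\,d\mathbf{x} = 0$$

for all $\phi \in \mathrm{Dom}(V(\mathbf{X}))$.

Taking $\phi = (\psi V - \chi)1_{E_m}$, we see that $\psi V - \chi$ is zero
almost everywhere on $E_m$, for all $m$, hence zero almost everywhere on
$\mathbb{R}^n$. Thus, $\psi V$ is equal to $\chi$ as an element of
$L^2(\mathbb{R}^n)$. This shows that $\psi \in \mathrm{Dom}(V(\mathbf{X}))$.
Thus, actually, $\mathrm{Dom}(V(\mathbf{X})^*) = \mathrm{Dom}(V(\mathbf{X}))$.
Since we have already shown that $V(\mathbf{X})^*$ is an extension of
$V(\mathbf{X})$, we conclude that $V(\mathbf{X})$ is self-adjoint on
$\mathrm{Dom}(V(\mathbf{X}))$. ■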

If we specialize the preceding proposition to the case
$V(\mathbf{x}) = x_j$, we obtain the following result about the position
operator.


Corollary 9.31 The position operator $X_j$ is self-adjoint on the domain

$$\mathrm{Dom}(X_j) = \{\psi \in L^2(\mathbb{R}^n) \mid x_j\psi(\mathbf{x}) \in L^2(\mathbb{R}^n)\}.$$

We now turn to consideration of the momentum operator. Since the
Fourier transform converts $\partial/\partial x_j$ into multiplication by
$ik_j$ (Proposition A.17), we can use the preceding results on multiplication
operators to obtain a natural domain on which the momentum operator is
self-adjoint.


Proposition 9.32 For each $j = 1, 2, \ldots, n$, define a domain
$\mathrm{Dom}(P_j) \subset L^2(\mathbb{R}^n)$ as follows:

$$\mathrm{Dom}(P_j) = \{\psi \in L^2(\mathbb{R}^n) \mid k_j\hat\psi(\mathbf{k}) \in L^2(\mathbb{R}^n)\},$$

where $\hat\psi$ is the Fourier transform of $\psi$. Define $P_j$ on this
domain by

$$P_j\psi = \mathcal{F}^{-1}(\hbar k_j\hat\psi(\mathbf{k})).$$

Then $P_j$ is self-adjoint on $\mathrm{Dom}(P_j)$.

The domain $\mathrm{Dom}(P_j)$ of $P_j$ can also be described as the set of
all $\psi \in L^2(\mathbb{R}^n)$ such that $\partial\psi/\partial x_j$,
computed in the distribution sense, belongs to $L^2(\mathbb{R}^n)$. For any
$\psi \in \mathrm{Dom}(P_j)$, we have
$P_j\psi = -i\hbar\,\partial\psi/\partial x_j$, where
$\partial\psi/\partial x_j$ is computed in the distribution sense.

Saying that the distributional derivative of $\psi$ belongs to
$L^2(\mathbb{R}^n)$ means (Proposition A.29) that there exists a (unique)
$\phi$ in $L^2(\mathbb{R}^n)$ such that

$$\langle \chi, \phi \rangle = -\left\langle \frac{\partial\chi}{\partial x_j}, \psi \right\rangle$$

for all $\chi \in C_c^\infty(\mathbb{R}^n)$. If $\psi$ is continuously
differentiable, then the distributional derivative of $\psi$ coincides with
the ordinary derivative of $\psi$. Thus, if $\psi \in L^2(\mathbb{R}^n)$ is
continuously differentiable, then $\psi$ belongs to $\mathrm{Dom}(P_j)$ if
and only if $\partial\psi/\partial x_j$, computed in the pointwise sense,
belongs to $L^2(\mathbb{R}^n)$, in which case
$P_j\psi = -i\hbar\,\partial\psi/\partial x_j$. On the other hand, if
$\psi \in \mathrm{Dom}(P_j)$, it is not necessarily the case that $\psi$ is
continuously differentiable.

In the case $n = 1$, the domain of $P_1$ certainly contains
$C_c^\infty(\mathbb{R})$, since each element $\psi$ of
$C_c^\infty(\mathbb{R})$ is a Schwartz function (Definition A.15), so that
$\hat\psi$ is also a Schwartz function, in which case $k\hat\psi(k)$ belongs
to $L^2(\mathbb{R})$. Now, as shown in Sect. 9.7, the operator
$-i\hbar\,d/dx$ is essentially self-adjoint on $C_c^\infty(\mathbb{R})$,
which means that this operator has a unique self-adjoint extension. This
self-adjoint extension must, therefore, agree with the operator $P_1$ in the
$n = 1$ case of Proposition 9.32.

Lemma 9.33 Suppose $\psi \in L^2(\mathbb{R}^n)$ has the property that
$\partial\psi/\partial x_j$, computed in the distribution sense, is equal to
an $L^2$ function $\phi$. Then $\hat\phi(\mathbf{k}) = ik_j\hat\psi(\mathbf{k})$,
showing that $k_j\hat\psi(\mathbf{k})$ belongs to $L^2(\mathbb{R}^n)$.

Conversely, suppose $\psi \in L^2(\mathbb{R}^n)$ has the property that
$k_j\hat\psi(\mathbf{k})$ belongs to $L^2(\mathbb{R}^n)$. Then
$\partial\psi/\partial x_j$, computed in the distribution sense, is equal to
the $L^2$ function $\mathcal{F}^{-1}(ik_j\mathcal{F}(\psi))$.

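Proof. Suppose $\partial\psi/\partial x_j$, computed in the distribution
sense, is equal to the $L^2$ function $\phi$ (see Definition A.28). Then by
the unitarity of the Fourier transform (Theorem A.19) and its behavior with
respect to differentiation (Proposition A.17), we have

$$\langle \chi, \phi \rangle = -\left\langle \frac{\partial\chi}{\partial x_j}, \psi \right\rangle = -\langle ik_j\mathcal{F}(\chi), \mathcal{F}(\psi) \rangle,$$

for all $\chi \in C_c^\infty(\mathbb{R}^n)$. Thus,

$$\langle \mathcal{F}(\chi), \mathcal{F}(\phi) \rangle = -\langle ik_j\mathcal{F}(\chi), \mathcal{F}(\psi) \rangle, \quad \chi \in C_c^\infty(\mathbb{R}^n).$$

Writing this equality out as an integral, we have

$$\int_{\mathbb{R}^n} \overline{\hat\chi(\mathbf{k})}\,\hat\phi(\mathbf{k})\,d\mathbf{k} = -\int_{\mathbb{R}^n} \overline{ik_j\hat\chi(\mathbf{k})}\,\hat\psi(\mathbf{k})\,d\mathbf{k} = \int_{\mathbb{R}^n} \overline{\hat\chi(\mathbf{k})}\,ik_j\hat\psi(\mathbf{k})\,d\mathbf{k} \tag{9.17}$$

for all $\chi \in C_c^\infty(\mathbb{R}^n)$.

We now claim that because (9.17) holds for all
$\chi \in C_c^\infty(\mathbb{R}^n)$, we must have
$\hat\phi(\mathbf{k}) = ik_j\hat\psi(\mathbf{k})$ for almost every
$\mathbf{k}$. Using the Stone-Weierstrass theorem and Theorem A.10, it is not
hard to show that the space of smooth functions with support in $[a, b]$ is
dense in $L^2([a, b])$, for all $a < b \in \mathbb{R}$. Since both
$\hat\phi$ and $k_j\hat\psi(\mathbf{k})$ are locally square-integrable, we
see that these two functions are equal almost everywhere on $[a, b]$, for all
$a < b \in \mathbb{R}$, and hence equal almost everywhere on $\mathbb{R}$.

Since $\hat\phi$ is globally square-integrable, so is
$k_j\hat\psi(\mathbf{k})$. Furthermore, by the injectivity of the $L^2$
Fourier transform, we have

$$\frac{\partial\psi}{\partial x_j} = \phi = \mathcal{F}^{-1}(ik_j\mathcal{F}(\psi))$$

as claimed.

The argument for the second part of the lemma is similar and left as an
exercise (Exercise 12). ■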

Proof of Proposition 9.32. By Proposition 9.30, the operator of
multiplication by $k_j$ is an unbounded self-adjoint operator on
$L^2(\mathbb{R}^n)$, with domain equal to the set of $\phi$ for which
$k_j\phi(\mathbf{k})$ belongs to $L^2(\mathbb{R}^n)$. It then follows from the
unitarity of the Fourier transform that
$P_j = \hbar\,\mathcal{F}^{-1}M_{k_j}\mathcal{F}$ is self-adjoint on
$\mathcal{F}^{-1}(\mathrm{Dom}(M_{k_j}))$, where $M_{k_j}$ denotes
multiplication by $k_j$. The second characterization of $\mathrm{Dom}(P_j)$
follows from Lemma 9.33. ■

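Proposition 9.34 Define a domain $\mathrm{Dom}(\Delta)$ as follows:

$$\mathrm{Dom}(\Delta) = \{\psi \in L^2(\mathbb{R}^n) \mid |\mathbf{k}|^2\hat\psi(\mathbf{k}) \in L^2(\mathbb{R}^n)\}.$$

Define $\Delta$ on this domain by the expression

$$\Delta\psi = -\mathcal{F}^{-1}(|\mathbf{k}|^2\hat\psi(\mathbf{k})), \tag{9.18}$$

where $\hat\psi$ is the Fourier transform of $\psi$ and $\mathcal{F}^{-1}$ is
the inverse Fourier transform. Then $\Delta$ is self-adjoint on
$\mathrm{Dom}(\Delta)$.

The domain $\mathrm{Dom}(\Delta)$ may also be described as the set of all
$\psi \in L^2(\mathbb{R}^n)$ such that $\Delta\psi$, computed in the
distribution sense, belongs to $L^2(\mathbb{R}^n)$. If
$\psi \in \mathrm{Dom}(\Delta)$, then $\Delta\psi$ as defined by (9.18)
agrees with $\Delta\psi$ computed in the distribution sense.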


The proof of Proposition 9.34 is extremely similar to that of
Proposition 9.32 and is omitted. Of course, the kinetic energy operator
$-\hbar^2\Delta/(2m)$ is also self-adjoint on the same domain as $\Delta$. It
is easy to see from (9.18) and the unitarity of the Fourier transform that
$-\hbar^2\Delta/(2m)$ is non-negative, that is, that

$$\left\langle \psi, -\frac{\hbar^2}{2m}\Delta\psi \right\rangle \ge 0$$

for all $\psi \in \mathrm{Dom}(\Delta)$.

Using the same reasoning as in Sects. 9.6 and 9.7, it is not hard to show
that the operators $P_j$ and $\Delta$ are essentially self-adjoint on
$C_c^\infty(\mathbb{R}^n)$. See Exercise 16.
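
The non-negativity of the kinetic energy operator mentioned above follows in three lines from (9.18). The sketch below assumes the normalization in which the Fourier transform $\mathcal{F}$ is unitary on $L^2(\mathbb{R}^n)$, with $\hat\psi = \mathcal{F}(\psi)$.

```latex
\begin{align*}
\Bigl\langle \psi, -\frac{\hbar^2}{2m}\Delta\psi \Bigr\rangle
 &= \frac{\hbar^2}{2m}\,\bigl\langle \psi, \mathcal{F}^{-1}\bigl(|\mathbf{k}|^2\hat\psi\bigr)\bigr\rangle
   && \text{by (9.18)} \\
 &= \frac{\hbar^2}{2m}\,\bigl\langle \hat\psi, |\mathbf{k}|^2\hat\psi\bigr\rangle
   && \text{by unitarity of } \mathcal{F} \\
 &= \frac{\hbar^2}{2m}\int_{\mathbb{R}^n} |\mathbf{k}|^2\,|\hat\psi(\mathbf{k})|^2\,d\mathbf{k} \;\ge\; 0.
\end{align*}
```

The last integral is finite for $\psi$ in the domain of the Laplacian, since both $\hat\psi$ and $|\mathbf{k}|^2\hat\psi$ are square-integrable.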

Care must be exercised in applying Proposition 9.34. Although the function

$$\psi(\mathbf{x}) := \frac{1}{|\mathbf{x}|}$$

is harmonic on $\mathbb{R}^3 \setminus \{0\}$, the Laplacian over
$\mathbb{R}^3$ of $\psi$ in the distribution sense is not zero (Exercise 13).
(It can be shown, by carefully analyzing the calculation in the proof of
Proposition 9.35, that $\Delta\psi$ is a nonzero multiple of a
$\delta$-function.) This example shows that if a function $\psi$ has a
singularity, calculating the Laplacian of $\psi$ away from the singularity
may not give the correct distributional Laplacian of $\psi$. For example, the
function $\phi$ in $L^2(\mathbb{R}^3)$ given by

$$\phi(\mathbf{x}) := \frac{e^{-|\mathbf{x}|^2}}{|\mathbf{x}|} \tag{9.19}$$

is not in $\mathrm{Dom}(\Delta)$, even though both $\phi$ and $\Delta\phi$
are (by direct computation) square-integrable over
$\mathbb{R}^3 \setminus \{0\}$. Indeed, when $n \le 3$, every element of
$\mathrm{Dom}(\Delta)$ is continuous (Exercise 14).
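
The "direct computation" for the function in (9.19) can be sketched as follows, using the standard formula for the Laplacian of a radial function on $\mathbb{R}^3 \setminus \{0\}$. The symbol $r = |\mathbf{x}|$ is introduced here for convenience.

```latex
% For a radial function f(r) on R^3 \ {0}:
%   \Delta f = f'' + (2/r) f' = (1/r) (r f)''.
\begin{align*}
\phi(\mathbf{x}) = \frac{e^{-r^2}}{r}
 \quad\Longrightarrow\quad
\Delta\phi &= \frac{1}{r}\,\frac{d^2}{dr^2}\bigl(e^{-r^2}\bigr)
            = \frac{(4r^2 - 2)\,e^{-r^2}}{r}, \qquad r > 0, \\
\int_{\mathbb{R}^3\setminus\{0\}} |\phi|^2\,d\mathbf{x}
 &= 4\pi\int_0^\infty e^{-2r^2}\,dr < \infty, \\
\int_{\mathbb{R}^3\setminus\{0\}} |\Delta\phi|^2\,d\mathbf{x}
 &= 4\pi\int_0^\infty (4r^2-2)^2\,e^{-2r^2}\,dr < \infty.
\end{align*}
```

In both integrals the factor $r^2$ from the volume element cancels the $1/r^2$ singularity, so the pointwise computation gives no hint of trouble. The function is nevertheless unbounded near the origin, hence not continuous, which is consistent with its not belonging to the domain of the Laplacian.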

Proposition 9.35 Suppose $\psi(\mathbf{x}) = g(\mathbf{x})f(|\mathbf{x}|)$,
where $g$ is a smooth function on $\mathbb{R}^n$ and $f$ is a smooth function
on $(0, \infty)$. Suppose also that $f$ satisfies

$$\lim_{r \to 0^+} r^{n-1}f(r) = 0,$$

$$\lim_{r \to 0^+} r^{n-1}f'(r) = 0.$$

If both $\psi$ and $\Delta\psi$ are square-integrable over
$\mathbb{R}^n \setminus \{0\}$, then $\psi$ belongs to
$\mathrm{Dom}(\Delta)$.


Note that the second condition in the proposition fails if $n = 3$ and
$f(r) = 1/r$. We will make use of this result in Chap. 18.

Proof. To apply Proposition 9.34, we need to compute
$\langle \psi, \Delta\chi \rangle$, for each
$\chi \in C_c^\infty(\mathbb{R}^n)$. We choose a large cube $C$, centered at
the origin and such that the support of $\chi$ is contained in the interior
of $C$. Then we consider the integral of
$\overline{\psi}\,(\partial^2\chi/\partial x_j^2)$ over
$C \setminus C_\varepsilon$, where $C_\varepsilon$ is a cube centered at the
origin and having side-length $\varepsilon$. We evaluate the $x_j$-integral
first and we integrate by parts twice. For "good" values of the remaining
variables, $x_j$ ranges over all of $C$, in which case there are no boundary
terms to worry about. For "bad" values of the remaining variables, we get two
kinds of boundary terms, one involving
$\overline{\psi}\,(\partial\chi/\partial x_j)$ and one involving
$(\partial\overline{\psi}/\partial x_j)\chi$, in both cases integrated over
two opposite faces of $C_\varepsilon$.

Now,

$$\frac{\partial\psi}{\partial x_j} = \frac{\partial g}{\partial x_j}\,f(|\mathbf{x}|) + g(\mathbf{x})\,\frac{df}{dr}\,\frac{x_j}{r}.$$

Since the area of the faces of the cube is $\varepsilon^{n-1}$, the
assumption on $f$ will cause the boundary terms to disappear in the limit as
$\varepsilon$ tends to zero. Furthermore, both $\psi$ and $\Delta\psi$ are in
$L^2(\mathbb{R}^n)$ and thus in $L^1(C)$, where in the case of $\Delta\psi$,
we simply leave the value at the origin (which is a set of measure zero)
undefined. Thus, integrals of $\overline{\psi}\,\Delta\chi$ and
$(\Delta\overline{\psi})\chi$ over $C \setminus C_\varepsilon$ will converge
to integrals over $C$. Since the boundary terms vanish in the limit, we are
left with

$$\langle \psi, \Delta\chi \rangle = \langle \Delta\psi, \chi \rangle.$$

Thus, the distributional Laplacian of $\psi$ is simply integration against
the "pointwise" Laplacian, ignoring the origin. Proposition 9.34 then tells
us that $\psi \in \mathrm{Dom}(\Delta)$. ■


9.9 Sums of Self-Adjoint Operators 


In the previous section, we have succeeded in defining the Laplacian
$\Delta$, and hence also the kinetic energy operator $-\hbar^2\Delta/(2m)$,
as a self-adjoint operator on a natural dense domain in $L^2(\mathbb{R}^n)$.
We have also defined the potential energy operator $V(\mathbf{X})$ as a
self-adjoint operator on a different dense domain, for any measurable
function $V : \mathbb{R}^n \to \mathbb{R}$. To obtain the Schrödinger
operator $-\hbar^2\Delta/(2m) + V(\mathbf{X})$, we "merely" have to make sense
of the sum of two unbounded self-adjoint operators. This task, however,
turns out to be more difficult than might be expected. In particular, if
$V$ is a highly singular function, then $-\hbar^2\Delta/(2m) + V(\mathbf{X})$
may fail to be self-adjoint or essentially self-adjoint on any natural
domain.

Definition 9.36 If $A$ and $B$ are unbounded operators on $\mathbf{H}$, then
$A + B$ is the operator with domain

$$\mathrm{Dom}(A + B) := \mathrm{Dom}(A) \cap \mathrm{Dom}(B)$$

and given by $(A + B)\psi = A\psi + B\psi$.


The sum of two unbounded self-adjoint operators $A$ and $B$ may fail to be
self-adjoint or even essentially self-adjoint. [If, however, $B$ is bounded
with $\mathrm{Dom}(B) = \mathbf{H}$, then Proposition 9.13 shows that $A + B$
is self-adjoint on $\mathrm{Dom}(A) \cap \mathrm{Dom}(B) = \mathrm{Dom}(A)$.]
For one thing, if $A$ and $B$ are unbounded, then
$\mathrm{Dom}(A) \cap \mathrm{Dom}(B)$ may fail to be dense in $\mathbf{H}$.
But even if $\mathrm{Dom}(A) \cap \mathrm{Dom}(B)$ is dense in $\mathbf{H}$,
it can easily happen that $A + B$ is not essentially self-adjoint on this
domain. (See, for example, Sect. 9.10.) Many things that are simple for
bounded self-adjoint operators become complicated when dealing with
unbounded self-adjoint operators!

In this section, we examine criteria on a function $V$ under which the
Schrödinger operator

$$H = -\frac{\hbar^2}{2m}\Delta + V$$

is self-adjoint or essentially self-adjoint on some natural domain inside
$L^2(\mathbb{R}^n)$.

Theorem 9.37 (Kato-Rellich Theorem) Suppose that $A$ and $B$ are
unbounded self-adjoint operators on $\mathbf{H}$. Suppose that
$\mathrm{Dom}(A) \subset \mathrm{Dom}(B)$ and that there exist positive
constants $a$ and $b$ with $a < 1$ such that

$$\|B\psi\| \le a\|A\psi\| + b\|\psi\| \tag{9.20}$$

for all $\psi \in \mathrm{Dom}(A)$. Then $A + B$ is self-adjoint on
$\mathrm{Dom}(A)$ and essentially self-adjoint on any subspace of
$\mathrm{Dom}(A)$ on which $A$ is essentially self-adjoint. Furthermore, if
$A$ is non-negative, then the spectrum of $A + B$ is bounded below by
$-b/(1 - a)$.

Note that since we assume $\mathrm{Dom}(B) \supset \mathrm{Dom}(A)$, the
natural domain for $A + B$ is
$\mathrm{Dom}(A) \cap \mathrm{Dom}(B) = \mathrm{Dom}(A)$. An operator $B$
satisfying (9.20) is said to be relatively bounded with respect to $A$, with
relative bound $a$.

Proof. We use the trivial variant of Theorem 9.21 given in Exercise 8.
Choose a positive real number $\mu$ large enough that $a + b/\mu < 1$, which
is possible because we assume $a < 1$. Then for any
$\psi \in \mathrm{Dom}(A)$, we have

$$(A + B + i\mu I)\psi = \left(B(A + i\mu I)^{-1} + I\right)(A + i\mu I)\psi. \tag{9.21}$$

For any $\psi \in \mathbf{H}$, we compute that

$$\|B(A + i\mu I)^{-1}\psi\| \le a\,\|A(A + i\mu I)^{-1}\psi\| + b\,\|(A + i\mu I)^{-1}\psi\| \le \left(a + \frac{b}{\mu}\right)\|\psi\|. \tag{9.22}$$

Here we have made use of the estimates

$$\|A(A + i\mu I)^{-1}\| \le 1, \qquad \|(A + i\mu I)^{-1}\| \le \frac{1}{\mu},$$

both of which are elementary (Exercise 17).

If $C$ denotes the operator $B(A + i\mu I)^{-1}$, (9.22) tells us that
$\|C\| \le (a + b/\mu) < 1$. Thus, by Lemma 7.6, $C + I$ is invertible.
Furthermore, since $A$ is self-adjoint, $A + i\mu I$ maps $\mathrm{Dom}(A)$
onto $\mathbf{H}$. Thus, (9.21) tells us that $A + B + i\mu I$ also maps
$\mathrm{Dom}(A)$ onto $\mathbf{H}$. The same argument shows that
$A + B - i\mu I$ maps $\mathrm{Dom}(A)$ onto $\mathbf{H}$ and we conclude, by
Exercise 8, that $A + B$ is self-adjoint on $\mathrm{Dom}(A)$.
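
The algebra behind this part of the proof is short enough to write out. The sketch below assumes only the hypotheses of the Kato-Rellich theorem; the numerical values of $a$ and $b$ in the last line are hypothetical and serve only as an illustration.

```latex
% Factorization: valid for psi in Dom(A), since (A + i mu I) psi lies in H
% and (A + i mu I)^{-1} maps H back into Dom(A), which is contained in Dom(B).
\begin{align*}
\bigl(B(A+i\mu I)^{-1} + I\bigr)(A+i\mu I)\psi
 &= B\psi + (A+i\mu I)\psi = (A + B + i\mu I)\psi.
\end{align*}
% Norm estimate: apply the relative bound to the vector (A + i mu I)^{-1} psi.
\begin{align*}
\|B(A+i\mu I)^{-1}\psi\|
 &\le a\,\|A(A+i\mu I)^{-1}\psi\| + b\,\|(A+i\mu I)^{-1}\psi\|
 \le \Bigl(a + \frac{b}{\mu}\Bigr)\|\psi\|.
\end{align*}
% Choice of mu: a + b/mu < 1 is equivalent to mu > b/(1-a).
\[
a + \frac{b}{\mu} < 1 \iff \mu > \frac{b}{1-a};
\qquad \text{e.g. } a = \tfrac12,\ b = 3 \ \text{requires}\ \mu > 6.
\]
```

The same threshold $b/(1-a)$ reappears as the lower bound on the spectrum when $A$ is non-negative.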

Suppose, in addition, that $A$ is non-negative. Let us replace $i\mu$ by
$\lambda > 0$ in (9.21). Calculating as in (9.22), using the estimates in
Exercise 18, we obtain that

$$\|B(A + \lambda I)^{-1}\psi\| \le \left(a + \frac{b}{\lambda}\right)\|\psi\|$$

for all $\psi \in \mathbf{H}$. If $\lambda > b/(1 - a)$, then
$a + b/\lambda < 1$, and by the above argument,
$\mathrm{Range}(A + B + \lambda I) = \mathbf{H}$. Furthermore, since
$A + B + \lambda I$ is self-adjoint, Proposition 9.12 tells us that
$\ker(A + B + \lambda I) = \{0\}$. This shows that $A + B + \lambda I$ is
invertible and $-\lambda$ is in the resolvent set of $A + B$. We conclude,
then, that the spectrum of $A + B$ is contained in $[-b/(1 - a), +\infty)$.

The last part of the theorem, concerning essential self-adjointness, is left
as an exercise (Exercise 19). ■

Theorem 9.38 Suppose $n$ is at most 3 and $V : \mathbb{R}^n \to \mathbb{R}$
is a measurable function that can be decomposed as a sum of two real-valued,
measurable functions $V_1$ and $V_2$, with $V_1$ belonging to
$L^2(\mathbb{R}^n)$ and $V_2$ being bounded. Then the Schrödinger operator
$-\hbar^2\Delta/(2m) + V(\mathbf{X})$ is self-adjoint on
$\mathrm{Dom}(\Delta)$. Furthermore, $-\hbar^2\Delta/(2m) + V(\mathbf{X})$ is
bounded below.

Implicit in the statement of the theorem is that
$\mathrm{Dom}(V(\mathbf{X}))$, as given in Proposition 9.30, contains
$\mathrm{Dom}(\Delta)$. A result similar to Theorem 9.38 holds in
$\mathbb{R}^n$, $n \ge 4$, but the condition that $V_1$ belongs to
$L^2(\mathbb{R}^n)$ is replaced by the condition that $V_1$ belongs to
$L^p(\mathbb{R}^n)$ for some $p > n/2$. See Theorem X.20 in Volume II of
[34].

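Proof. We apply the Kato-Rellich theorem with $A = -\hbar^2\Delta/(2m)$ and
$B = V(\mathbf{X})$. Assume $\psi \in \mathrm{Dom}(\Delta)$ and fix some
$\varepsilon > 0$. By Exercise 14, there exists a constant $c_\varepsilon$
such that

$$|\psi(\mathbf{x})| \le \varepsilon\|\Delta\psi\| + c_\varepsilon\|\psi\|$$

for all $\mathbf{x} \in \mathbb{R}^n$. Thus, if $V$ is as in the theorem and
$\psi \in \mathrm{Dom}(\Delta)$,

$$\|V\psi\| \le \sup|\psi(\mathbf{x})|\,\|V_1\| + \sup|V_2(\mathbf{x})|\,\|\psi\| \le \varepsilon\|V_1\|\,\|\Delta\psi\| + \left(c_\varepsilon\|V_1\| + \sup|V_2(\mathbf{x})|\right)\|\psi\|.$$

This shows that $\mathrm{Dom}(V(\mathbf{X})) \supset \mathrm{Dom}(\Delta)$.
Since $\varepsilon$ is arbitrary, we can arrange for the constant in front of
$\|\Delta\psi\|$ to be less than one and the Kato-Rellich theorem applies. ■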
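
The final step of this proof can be made explicit by tracking the constants. The sketch below uses the notation of the proof, with $\|V_1\|$ the $L^2$ norm of $V_1$; the explicit threshold for $\varepsilon$ is worked out here for illustration and is not stated in the text.

```latex
% Kato--Rellich is applied with A = -\hbar^2\Delta/(2m),
% so that \|\Delta\psi\| = (2m/\hbar^2)\,\|A\psi\|.
\begin{align*}
\|V\psi\| &\le \|V_1\psi\| + \|V_2\psi\|
  \le \sup_{\mathbf{x}}|\psi(\mathbf{x})|\,\|V_1\|
    + \sup_{\mathbf{x}}|V_2(\mathbf{x})|\,\|\psi\| \\
 &\le \underbrace{\frac{2m\,\varepsilon\,\|V_1\|}{\hbar^2}}_{a}\,\|A\psi\|
   + \underbrace{\Bigl(c_\varepsilon\|V_1\|
   + \sup_{\mathbf{x}}|V_2(\mathbf{x})|\Bigr)}_{b}\,\|\psi\|, \\
a < 1 &\iff \varepsilon < \frac{\hbar^2}{2m\,\|V_1\|}
  \qquad (\text{when } V_1 \neq 0).
\end{align*}
```

The constant $b$ may become large as $\varepsilon$ shrinks, but the theorem only requires that $a$ be less than one.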

Theorem 9.39 Suppose $n$ is at most 3 and $V : \mathbb{R}^n \to \mathbb{R}$
is a measurable function that can be decomposed as a sum of three
real-valued, measurable functions $V_1$, $V_2$, and $V_3$, with $V_1$
belonging to $L^2(\mathbb{R}^n)$, $V_2$ being bounded, and $V_3$ being
non-negative and locally square-integrable. Then the Schrödinger operator
$-\hbar^2\Delta/(2m) + V(\mathbf{X})$ is essentially self-adjoint on
$C_c^\infty(\mathbb{R}^n)$.

The proof of this result would take us too far afield and is omitted. See
Theorem X.29 in Volume II of [34]. Note that we assume only that $V_3$ is
non-negative and locally square-integrable; $V_3$ can tend to $+\infty$
arbitrarily fast at infinity. Again, the same result applies in
$\mathbb{R}^n$, $n \ge 4$, if the condition on $V_1$ is replaced by the
assumption that $V_1 \in L^p(\mathbb{R}^n)$ for some $p > n/2$.

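Proposition 9.40 Fix $\mathbf{a}$ and $\mathbf{b}$ in $\mathbb{R}^n$ and let
$\mathbf{a} \cdot \mathbf{X} + \mathbf{b} \cdot \mathbf{P}$ denote the
operator given by

$$(\mathbf{a} \cdot \mathbf{X} + \mathbf{b} \cdot \mathbf{P})\psi(\mathbf{x}) = (\mathbf{a} \cdot \mathbf{x})\psi(\mathbf{x}) - i\hbar\sum_{j=1}^{n} b_j\frac{\partial\psi}{\partial x_j}.$$

Then $\mathbf{a} \cdot \mathbf{X} + \mathbf{b} \cdot \mathbf{P}$ is
essentially self-adjoint on $C_c^\infty(\mathbb{R}^n)$.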

Proof. We use the same strategy as in Sect. 9.7, namely we explicitly
solve the equation $A^*\psi = \pm i\psi$ and find that there are no nonzero,
square-integrable solutions.

The case $\mathbf{b} = 0$ is not hard to analyze and is left as an exercise
(Exercise 20). Assume, then, that $\mathbf{b} \neq 0$. By making a rotational
change of variables, we can assume that $\mathbf{b} = \alpha\mathbf{e}_1$ and
$\mathbf{a} = \beta\mathbf{e}_1 + \gamma\mathbf{e}_2$, so that

$$(A\psi)(\mathbf{x}) = (\beta x_1 + \gamma x_2)\psi(\mathbf{x}) - i\hbar\alpha\frac{\partial\psi}{\partial x_1}. \tag{9.23}$$

(If $n = 1$, the $\gamma x_2$ term is not present.) As in the proof of
Proposition 9.29, the adjoint $A^*$ of $A$ will be given by the same formula
as $A$, with $\mathrm{Dom}(A^*)$ consisting of those elements $\psi$ of
$L^2(\mathbb{R}^n)$ for which the right-hand side of (9.23), computed in the
distributional sense, belongs to $L^2(\mathbb{R}^n)$.

We now apply the criterion for essential self-adjointness in Corollary 9.22.
We need to show that the equations $A^*\psi = i\psi$ and $A^*\psi = -i\psi$
have no nonzero solutions in $\mathrm{Dom}(A^*)$. After rewriting the
equation $A^*\psi = i\psi$ as

$$\frac{\partial\psi}{\partial x_1} = -\frac{i}{\hbar\alpha}(\beta x_1 + \gamma x_2)\psi(\mathbf{x}) - \frac{1}{\hbar\alpha}\psi(\mathbf{x}), \tag{9.24}$$

we can easily find the general distributional solution as

$$\psi(\mathbf{x}) = c(x_2, \ldots, x_n)\exp\left\{-\frac{i}{\alpha\hbar}\left(\frac{1}{2}\beta x_1^2 + \gamma x_1x_2\right) - \frac{1}{\alpha\hbar}x_1\right\}. \tag{9.25}$$

[It is easily verified that if we let $\phi$ equal $\psi$ divided by the
exponential on the right-hand side of (9.25), then $\phi$ satisfies
$\partial\phi/\partial x_1 = 0$ in the distributional sense. Exercise 21 then
tells us that $\phi$ must be a function of $x_2, \ldots, x_n$.] Since the
exponential factor is never square integrable as a function of $x_1$ with
$x_2$ fixed, the only way that $\psi$ can be square integrable is if $c$ is
zero for almost every value of $(x_2, \ldots, x_n)$, in which case $\psi$ is
the zero element of $L^2(\mathbb{R}^n)$. A similar argument shows that the
equation $A^*\psi = -i\psi$ has no nonzero solutions. ■
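
The two facts used at the end of this proof can be verified directly: the exponential factor solves the first-order equation in $x_1$, and its modulus is a real exponential that is not square integrable in $x_1$. In the sketch below, $E(\mathbf{x})$ is a name introduced here for the exponential factor; $\alpha$, $\beta$, $\gamma$ are the real constants from the rotated form of the operator, with $\alpha \neq 0$.

```latex
\begin{align*}
E(\mathbf{x}) &:= \exp\Bigl\{-\frac{i}{\alpha\hbar}\Bigl(\tfrac12\beta x_1^2 + \gamma x_1x_2\Bigr)
                 - \frac{x_1}{\alpha\hbar}\Bigr\}, \\
\frac{\partial E}{\partial x_1}
 &= \Bigl[-\frac{i}{\alpha\hbar}\,(\beta x_1 + \gamma x_2) - \frac{1}{\alpha\hbar}\Bigr]E(\mathbf{x}), \\
|E(\mathbf{x})| &= e^{-x_1/(\alpha\hbar)}
 \quad\Longrightarrow\quad
 \int_{-\infty}^{\infty} |E(\mathbf{x})|^2\,dx_1
   = \int_{-\infty}^{\infty} e^{-2x_1/(\alpha\hbar)}\,dx_1 = \infty.
\end{align*}
```

The imaginary terms in the exponent contribute only a phase, so the divergence comes entirely from the real term, at $x_1 \to -\infty$ if $\alpha > 0$ and at $x_1 \to +\infty$ if $\alpha < 0$.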


9.10 Another Counterexample 

In this section, we will show that the Schrödinger operator
$H = P^2/(2m) - X^4$ is not essentially self-adjoint on
$C_c^\infty(\mathbb{R})$, even though $H$ is certainly symmetric. By
contrast, $P^2/(2m) + X^4$ is essentially self-adjoint, by Theorem 9.39. The
operator $P^2/(2m) - X^4$ is a more serious counterexample than the one in
Sect. 12.2, in that it does not involve any obviously incorrect choice of
boundary conditions. On the other hand, it should not be surprising that
something goes "wrong" in a quantum system with a potential equal to $-x^4$.
After all, a classical system with this potential has trajectories that go to
infinity in finite time (see Exercise 4 in Chap. 2).

To show that $H$ is not essentially self-adjoint, we will show that the
adjoint $H^*$ is not symmetric. Suppose $\psi$ is a $C^\infty$ function such
that both $\psi$ and the function

$$-\frac{\hbar^2}{2m}\psi''(x) - x^4\psi(x) \tag{9.26}$$

belong to $L^2(\mathbb{R})$. Using integration by parts, as in the proof of
Lemma 9.28, we can see that $\psi$ is in the domain of $H^*$ and $H^*\psi$ is
the function in (9.26). We will construct an approximate eigenvector
$\psi \in \mathrm{Dom}(H^*)$ for $H^*$ with an imaginary eigenvalue
$i\alpha$, which will show that $H^*$ is not symmetric and thus $H$ is not
essentially self-adjoint.


Theorem 9.41 Define an operator $H$ with
$\mathrm{Dom}(H) = C_c^\infty(\mathbb{R})$ by the formula

$$H = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} - x^4.$$

Then $H$ is not essentially self-adjoint.

In preparation for the proof, let us define a function $p(x)$ on
$\mathbb{R}$ such that

$$\frac{p(x)^2}{2m} - x^4 = i\alpha;$$

that is,

$$p(x) = \sqrt{2m}\,\sqrt{x^4 + i\alpha}. \tag{9.27}$$

Here we take the square root that is in the first quadrant. The function
$p(x)$ represents "the momentum of a classical particle with energy
$i\alpha$."


Lemma 9.42 If $\psi_\alpha$ is given by

$$\psi_\alpha(x) = \frac{1}{\sqrt{p(x)}}\exp\left\{\frac{i}{\hbar}\int_0^x p(y)\,dy\right\}, \tag{9.28}$$

then $\psi_\alpha$ belongs to $L^2(\mathbb{R})$ and the function

$$-\frac{\hbar^2}{2m}\frac{d^2\psi_\alpha}{dx^2} - x^4\psi_\alpha \tag{9.29}$$

also belongs to $L^2(\mathbb{R})$. Furthermore, we have

$$\left(-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} - x^4 - i\alpha\right)\psi_\alpha(x) = -\frac{\hbar^2}{2m}\psi_\alpha(x)\,m_\alpha(x),$$

where

$$m_\alpha(x) = \frac{5x^6}{(x^4 + i\alpha)^2} - \frac{3x^2}{x^4 + i\alpha}.$$

It will be apparent from the proof that the two terms in (9.29) are not
separately in $L^2(\mathbb{R})$. The motivation for the definition of
$\psi_\alpha$ comes from the WKB approximation (Chap. 15) with a complex
value for the energy.
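Proof. Let us consider the integral of $p$,

$$\int_0^x p(y)\,dy = \sqrt{2m}\int_0^x \sqrt{y^4 + i\alpha}\,dy.$$

Using the power series for $(1 + x)^a$ we see that for large $y$,

$$\sqrt{y^4 + i\alpha} = y^2\sqrt{1 + i\alpha/y^4} = y^2\left(1 + \frac{i\alpha}{2y^4} + O\!\left(\frac{1}{y^8}\right)\right).$$

From this estimate, it is easy to see that the imaginary part of
$\int_0^x p(y)\,dy$ remains bounded as $x$ tends to $\pm\infty$. It follows
that the exponential in the definition of $\psi_\alpha$ is bounded, from
which it is easy to see that $\psi_\alpha$ is square integrable.

Now, using the formula for the second derivative of a product, we obtain

$$-\hbar^2\frac{d^2\psi_\alpha}{dx^2} = \left[\frac{p(x)^2}{\sqrt{p(x)}} - i\hbar\frac{p'(x)}{\sqrt{p(x)}} - 2i\hbar\,p(x)\left(-\frac{1}{2}\frac{p'(x)}{p(x)^{3/2}}\right) - \hbar^2\frac{d^2}{dx^2}\frac{1}{\sqrt{p(x)}}\right]\exp\left\{\frac{i}{\hbar}\int_0^x p(y)\,dy\right\}. \tag{9.30}$$

The factor of $1/\sqrt{p(x)}$ in the definition of $\psi_\alpha$ was chosen
precisely so that the second and third terms in square brackets will cancel.
If we replace $p(x)^2$ in the numerator of the first term by
$2m(x^4 + i\alpha)$, we obtain

$$-\frac{\hbar^2}{2m}\psi_\alpha''(x) - (x^4 + i\alpha)\psi_\alpha(x) = -\frac{\hbar^2}{2m}\left[\frac{d^2}{dx^2}p(x)^{-1/2}\right]\exp\left\{\frac{i}{\hbar}\int_0^x p(y)\,dy\right\}.$$

It is then an elementary calculation to show that

$$\frac{d^2}{dx^2}p(x)^{-1/2} = p(x)^{-1/2}\left[5(x^4 + i\alpha)^{-2}x^6 - 3(x^4 + i\alpha)^{-1}x^2\right],$$

from which the lemma follows. ■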
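
The "elementary calculation" of the second derivative of $p(x)^{-1/2}$ can be sketched as follows. The abbreviation $u(x) = x^4 + i\alpha$ is introduced here for convenience; the constant factor $(2m)^{-1/4}$ cancels when dividing by $p(x)^{-1/2}$.

```latex
\begin{align*}
u(x) &:= x^4 + i\alpha, \qquad p(x)^{-1/2} = (2m)^{-1/4}\,u(x)^{-1/4}, \\
\frac{d}{dx}\,u^{-1/4} &= -\tfrac14\,u^{-5/4}\cdot 4x^3 = -x^3\,u^{-5/4}, \\
\frac{d^2}{dx^2}\,u^{-1/4}
 &= -3x^2\,u^{-5/4} - x^3\cdot\bigl(-\tfrac54\bigr)u^{-9/4}\cdot 4x^3
  = -3x^2\,u^{-5/4} + 5x^6\,u^{-9/4} \\
 &= u^{-1/4}\Bigl(5x^6\,u^{-2} - 3x^2\,u^{-1}\Bigr).
\end{align*}
```

Since $|u(x)| \ge \max(x^4, \alpha)$, each of the two terms in the last bracket is bounded uniformly in $x$ once $\alpha \ge 1$, which is the boundedness property used in the proof of Theorem 9.41.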

Proof of Theorem 9.41. If $H$ were essentially self-adjoint, $H^*$ (which
would coincide with $H^{cl}$) would be self-adjoint and, in particular,
symmetric. If this were the case, we would have, by the proof of Lemma 7.8,

$$\langle (H^* - i\alpha I)\psi, (H^* - i\alpha I)\psi \rangle \ge \alpha^2\langle \psi, \psi \rangle \tag{9.31}$$

for all $\psi \in \mathrm{Dom}(H^*)$ and $\alpha \in \mathbb{R}$. But if
$\psi_\alpha$ is the function in Lemma 9.42, the discussion preceding
Theorem 9.41 shows that $\psi_\alpha$ belongs to $\mathrm{Dom}(H^*)$.
Furthermore, it is easily verified that there is a constant $C$ such that
$|m_\alpha(x)| \le C$ for all $\alpha \ge 1$ and $x \in \mathbb{R}$. Thus,
for all sufficiently large $\alpha$, we have

$$\|(H^* - i\alpha I)\psi_\alpha\|^2 \le \frac{\hbar^4}{4m^2}C^2\|\psi_\alpha\|^2 < \alpha^2\|\psi_\alpha\|^2,$$

contradicting (9.31). ■

See Exercise 22 for a more explicit approach to showing that H* is not 
symmetric. 


9.11 Exercises 


1. Show that an unbounded operator $A$ fails to be closable if and only
if the closure of the graph of $A$ contains an element of the form
$(0, \psi)$ with $\psi \neq 0$.

2. Define an unbounded operator $A$ on $L^2([0,1])$ with domain
$\mathrm{Dom}(A) = C([0,1])$ by

$$Af = f(0)\mathbf{1},$$

where $\mathbf{1}$ is the constant function. Show that $A$ is not closable.


3. Prove Proposition 9.13. 

4. Suppose that $A$ is an unbounded self-adjoint operator on $\mathbf{H}$ and
that numbers $\lambda_n$ in $\sigma(A)$ converge to some
$\lambda \in \mathbb{R}$. Using Point 1 of Proposition 9.18, show that
$\lambda \in \sigma(A)$.

5. Suppose A is a closed operator on H. Show that the kernel of A is a 
closed subspace of H. 


6. Suppose $A$ is a closed operator on $\mathbf{H}$. Define a norm
$\|\cdot\|_1$ on $\mathrm{Dom}(A)$ by

$$\|\psi\|_1 = \|\psi\| + \|A\psi\|.$$

Show that $\mathrm{Dom}(A)$ is a Banach space with respect to
$\|\cdot\|_1$.


7. Let $A$ be an unbounded operator on $\mathbf{H}$.

(a) Show that if $A$ is symmetric, then $A^{cl}$ is also symmetric.

(b) Show that if $B$ is an extension of $A$, then $A^*$ is an extension of
$B^*$.

(c) Suppose $A$ is self-adjoint and $B$ is an extension of $A$. Show that
if $B$ is symmetric, then $\mathrm{Dom}(A) = \mathrm{Dom}(B)$. (That is to
say, a self-adjoint operator has no proper symmetric extensions.)

8. Fix a positive real number $\mu$.

(a) Show that a symmetric operator $A$ is self-adjoint if and only if
$\mathrm{Range}(A + i\mu I)$ and $\mathrm{Range}(A - i\mu I)$ are equal to
$\mathbf{H}$.

(b) Show that a symmetric operator $A$ is essentially self-adjoint if
and only if $\mathrm{Range}(A + i\mu I)$ and $\mathrm{Range}(A - i\mu I)$ are
dense in $\mathbf{H}$.

9. Let $A$ be the operator considered in Sect. 9.6. Using Lemma 9.28,
show that for each $\lambda \in \mathbb{C}$, there exists
$\psi \in \mathrm{Dom}(A^*)$ with $A^*\psi = \lambda\psi$. Conclude that each
$\lambda \in \mathbb{C}$ belongs to the spectrum of $A^{cl}$.

Hint: Recall that $(A^{cl})^* = A^*$.

10. Let $A$ be the operator considered in Sect. 9.6 and suppose $\psi$ is in
the domain of $A^{cl}$. Then there exists a sequence $\psi_n$ in
$\mathrm{Dom}(A)$ such that $\psi_n$ converges to $\psi$ in $L^2([0,1])$ and
such that $A\psi_n$ converges to some $\chi$ in $L^2([0,1])$.

(a) Show that

$$\psi_n(x) = i\langle 1_{[0,x]}, A\psi_n \rangle$$

for all $x \in [0,1]$.

(b) Show that $\psi_n$ converges uniformly to the function

$$\psi(x) = i\langle 1_{[0,x]}, \chi \rangle.$$

(c) Conclude that $\psi$ is continuous and satisfies
$\psi(0) = \psi(1) = 0$.

11. Take $\mathbf{H} = L^2((0, \infty))$ and let $A$ be the operator
$-i\,d/dx$, with $\mathrm{Dom}(A)$ consisting of those smooth functions that
are supported on a compact subset of $(0, \infty)$. (Such a function is, in
particular, zero on $(0, \varepsilon)$ for some $\varepsilon > 0$.) Show that
$A$ is symmetric and that $A^* + iI$ is injective but that $A^* - iI$ is not
injective.

Hint: Imitate the arguments in the proofs of Propositions 9.27 and 9.29.

12. Prove the second part of Lemma 9.33. 

13. Let $\chi$ be a smooth, radial function on $\mathbb{R}^3$ such that for
$|\mathbf{x}| < 1$ we have $\chi(\mathbf{x}) = 1$, for $|\mathbf{x}| > 2$ we
have $\chi(\mathbf{x}) = 0$, and for $1 < |\mathbf{x}| < 2$, we have
$d\chi/dr < 0$. Show that

$$\int_{\mathbb{R}^3} \frac{1}{|\mathbf{x}|}\,\Delta\chi(\mathbf{x})\,d\mathbf{x} \neq 0,$$

which shows that the Laplacian of $1/|\mathbf{x}|$, in the distribution
sense, is not zero.

Hint: Let $E = C_1 \setminus C_2$, where $C_1$ is a cube centered at the
origin with side length 3 and where $C_2$ is a cube centered at the origin
with side length 1/2. Then $E$ contains the support of $\Delta\chi$. Using
integration by parts on $E$, show that

$$\int_E \frac{1}{|\mathbf{x}|}\,\Delta\chi(\mathbf{x})\,d\mathbf{x} = -\int_E \nabla\left(\frac{1}{|\mathbf{x}|}\right) \cdot \nabla\chi(\mathbf{x})\,d\mathbf{x}.$$


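14. Let $\mathrm{Dom}(\Delta) \subset L^2(\mathbb{R}^n)$ denote the domain of
the Laplacian, as given in Proposition 9.34, and assume $n \le 3$.

(a) Show that each $\psi \in \mathrm{Dom}(\Delta)$ is continuous and that
there exist constants $c_1$ and $c_2$ such that

$$|\psi(\mathbf{x})| \le c_1\|\psi\| + c_2\left\||\mathbf{k}|^2\hat\psi(\mathbf{k})\right\|$$

for all $\psi \in \mathrm{Dom}(\Delta)$.

Hint: Show that $\hat\psi$ is in $L^1$ by expressing $\hat\psi$ as the
product of two $L^2$ functions.

(b) Show that for any $\varepsilon > 0$, there exists a constant
$c_\varepsilon$ such that

$$|\psi(\mathbf{x})| \le c_\varepsilon\|\psi\| + \varepsilon\|\Delta\psi\|$$

for all $\psi \in \mathrm{Dom}(\Delta)$.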

15. Recall the definitions of $\mathrm{Dom}(P_j)$ and $\mathrm{Dom}(\Delta)$
in Sect. 9.8. Let $\mathrm{Dom}(P_j^2)$ be the set of all $\psi$ belonging to
$\mathrm{Dom}(P_j)$ such that $P_j\psi$ again belongs to $\mathrm{Dom}(P_j)$.
Show that

$$\bigcap_{j=1}^{n}\mathrm{Dom}(P_j^2) = \mathrm{Dom}(\Delta).$$

16. Let $Q_j$ denote the restriction to $C_c^\infty(\mathbb{R}^n)$ of the
momentum operator $P_j$. Show that $\mathrm{Dom}(Q_j^*) = \mathrm{Dom}(P_j)$.
Conclude that $Q_j$ is essentially self-adjoint.

17. Let $A$ be an unbounded self-adjoint operator on $\mathbf{H}$ and let
$\mu$ be a nonzero real number.

(a) Show that $\|(A + i\mu I)^{-1}\| \le 1/|\mu|$. Note that
$(A + i\mu I)^{-1}$ exists, by Theorem 9.17.

(b) Show that for all $\psi \in \mathbf{H}$,

$$\|\psi\|^2 = \|A(A + i\mu I)^{-1}\psi\|^2 + \mu^2\|(A + i\mu I)^{-1}\psi\|^2.$$

Conclude that $\|A(A + i\mu I)^{-1}\| \le 1$.

18. Let $A$ be an unbounded self-adjoint operator on $\mathbf{H}$. Suppose
$A$ is non-negative (Definition 9.19) and let $\lambda$ be a positive real
number.

(a) Show that $\|(A + \lambda I)^{-1}\| \le 1/\lambda$.

(b) Show that for all $\psi \in \mathbf{H}$,

$$\|\psi\|^2 \ge \|A(A + \lambda I)^{-1}\psi\|^2 + \lambda^2\|(A + \lambda I)^{-1}\psi\|^2.$$

Conclude that $\|A(A + \lambda I)^{-1}\| \le 1$.

19. Prove the last part of Theorem 9.37, concerning domains of essential
self-adjointness.

Hint: If $A$ is self-adjoint on $\mathrm{Dom}(A)$ and
$V \subset \mathrm{Dom}(A)$ is a dense subspace of $\mathbf{H}$, then $A$ is
essentially self-adjoint on $V$ if and only if the closure of $A|_V$ is equal
to $A$.

20. Let $A$ be the operator $\mathbf{b} \cdot \mathbf{X}$ on the domain
$C_c^\infty(\mathbb{R}^n)$, for some $\mathbf{b} \in \mathbb{R}^n$.

(a) Using the definition of the adjoint of an unbounded operator,
show that $\mathrm{Dom}(A^*)$ consists of all those $\psi$ in
$L^2(\mathbb{R}^n)$ for which the function
$(\mathbf{b} \cdot \mathbf{x})\psi(\mathbf{x})$ again belongs to
$L^2(\mathbb{R}^n)$.

(b) Using Proposition 9.30, show that $A$ is essentially self-adjoint.

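21. (a) Show that a function $\phi \in C_c^\infty(\mathbb{R}^n)$ can be
expressed as $\phi = \partial\chi/\partial x_1$ for some
$\chi \in C_c^\infty(\mathbb{R}^n)$ if and only if $\phi$ satisfies

$$\int_{-\infty}^{\infty}\phi(x_1, x_2, \ldots, x_n)\,dx_1 = 0$$

for all $(x_2, \ldots, x_n)$.

(b) Fix a function $\gamma \in C_c^\infty(\mathbb{R})$ such that
$\int_{-\infty}^{\infty}\gamma(x)\,dx = 1$. Show that any
$\phi \in C_c^\infty(\mathbb{R}^n)$ can be expressed as

$$\phi(\mathbf{x}) = \frac{\partial\chi}{\partial x_1}(\mathbf{x}) + f(x_2, \ldots, x_n)\gamma(x_1)$$

for some $\chi \in C_c^\infty(\mathbb{R}^n)$, where $f$ is the element of
$C_c^\infty(\mathbb{R}^{n-1})$ given by

$$f(x_2, \ldots, x_n) = \int_{-\infty}^{\infty}\phi(x_1, x_2, \ldots, x_n)\,dx_1.$$

(c) Suppose $T$ is a distribution on $\mathbb{R}^n$ with the property that

$$\frac{\partial T}{\partial x_1} = 0.$$

Define a distribution $c$ on $\mathbb{R}^{n-1}$ by the formula

$$c(f) = T(f(x_2, \ldots, x_n)\gamma(x_1)).$$

Show that for all $\phi \in C_c^\infty(\mathbb{R}^n)$ we have

$$T(\phi) = c(\tilde\phi),$$

where $\tilde\phi \in C_c^\infty(\mathbb{R}^{n-1})$ is given by

$$\tilde\phi(x_2, \ldots, x_n) = \int_{\mathbb{R}}\phi(x_1, x_2, \ldots, x_n)\,dx_1.$$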

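22. Let $H$ denote the Schrödinger operator in Theorem 9.41 and let
$\psi_\alpha$ be the function defined in Lemma 9.42.

(a) Show that

$$\langle \psi_\alpha, H^*\psi_\alpha \rangle - \langle H^*\psi_\alpha, \psi_\alpha \rangle = -\frac{\hbar^2}{2m}\lim_{A \to \infty}\left[\overline{\psi_\alpha(x)}\,\psi_\alpha'(x) - \overline{\psi_\alpha'(x)}\,\psi_\alpha(x)\right]_{-A}^{A}.$$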

(b) Now show by direct calculation that
The Spectral Theorem for Unbounded
Self-Adjoint Operators


This chapter gives statements and proofs of the spectral theorem for
unbounded self-adjoint operators, in the same forms as in the bounded
case, in terms of projection-valued measures, in terms of direct integrals,
and in terms of multiplication operators. The proof reduces the spectral
theorem for an unbounded self-adjoint operator $A$ to the spectral theorem
for the bounded operator $U := (A + iI)(A - iI)^{-1}$ (Sect. 10.4). This
bounded operator is, however, not self-adjoint but rather unitary. Thus,
before coming to the proof of the spectral theorem for unbounded
self-adjoint operators, we prove (Sect. 10.3) the spectral theorem for
bounded normal operators, those that commute with their adjoints. (A unitary
operator $U$ certainly commutes with its adjoint $U^* = U^{-1}$.) The proof
for a bounded normal operator $B$ is the same as for bounded self-adjoint
operators, except for the step in which we approximate continuous functions
on $\sigma(B)$ by polynomials. Since $\sigma(B)$ is not necessarily contained
in $\mathbb{R}$, we need to use the complex version of the Stone-Weierstrass
theorem, which requires us to consider polynomials in $\lambda$ and
$\bar\lambda$. We must then prove a strengthened version of the spectral
mapping theorem before proceeding along the lines of the proof for bounded
self-adjoint operators.

In Sect. 10.2, we discuss Stone's theorem, which gives a one-to-one
correspondence between strongly continuous one-parameter unitary groups and
self-adjoint operators. One direction of Stone's theorem follows from the
spectral theorem, that is, from the functional calculus that results from the
spectral theorem.


10.1 Statements of the Spectral Theorem 

The statement of the spectral theorem—in any of the forms that we have 
considered—is almost the same for unbounded self-adjoint operators as for 
bounded ones. The only difference is that the statement of the theorem in 
the unbounded case has to contain some description of the domain of the 
operator. 

Recall that if $\mu$ is a projection-valued measure on $(X, \Omega)$ with values in
$\mathcal{B}(\mathbf{H})$ and $\psi$ is an element of $\mathbf{H}$, then we can construct a non-negative,
real-valued measure $\mu_\psi$ from $\mu$ by setting $\mu_\psi(E) = \langle \psi, \mu(E)\psi \rangle$ for each
measurable set $E$. To motivate the following definition, consider integration
of a bounded measurable function $f$ against a projection-valued measure $\mu$.
Since the integral is multiplicative and complex-conjugation of a function
corresponds to taking the adjoint of the operator, we have

\[
\left\langle \left( \int_X f\,d\mu \right)\psi, \left( \int_X f\,d\mu \right)\psi \right\rangle
= \left\langle \psi, \left( \int_X \bar{f} f\,d\mu \right)\psi \right\rangle
= \int_X |f|^2\,d\mu_\psi. \tag{10.1}
\]

Suppose, now, that $f$ is an unbounded measurable function on $X$ and we
wish to define $\int_X f\,d\mu$, which will presumably be an unbounded operator.
It seems reasonable to define the domain of this operator to be the set of $\psi$ for which
the right-hand side of (10.1) is finite.

Proposition 10.1 Suppose $\mu$ is a projection-valued measure on $(X, \Omega)$
with values in $\mathcal{B}(\mathbf{H})$ and $f : X \to \mathbb{C}$ is a measurable function (not
necessarily bounded). Define a subspace $W_f$ of $\mathbf{H}$ by

\[
W_f = \left\{ \psi \in \mathbf{H} \;\middle|\; \int_X |f(\lambda)|^2\,d\mu_\psi(\lambda) < \infty \right\}. \tag{10.2}
\]

Then there exists a unique unbounded operator on $\mathbf{H}$ with domain $W_f$,
denoted by $\int_X f\,d\mu$, such that

\[
\left\langle \psi, \left( \int_X f\,d\mu \right)\psi \right\rangle = \int_X f(\lambda)\,d\mu_\psi(\lambda)
\]

for all $\psi$ in $W_f$. This operator satisfies (10.1) for all $\psi \in W_f$.

Note that since $\mu_\psi$ is a finite measure for all $\psi$, if $f$ is bounded then the
domain of $\int_X f\,d\mu$ is all of $\mathbf{H}$. Thus, in the bounded case, the definition of
$\int_X f\,d\mu$ in Proposition 10.1 agrees with our earlier definition (in Chap. 7)
of the integral. This means, in particular, that if $f$ is a bounded function,
$\int_X f\,d\mu$ is a bounded operator. Proposition 10.1 follows immediately from
the following result.

Proposition 10.2 Let $f$ be a measurable function on $X$ and let $W_f$ be as
in (10.2). Then the following results hold.

1. The space $W_f$ is a dense subspace of $\mathbf{H}$ and the map $Q_f : W_f \to \mathbb{C}$
given by

\[
Q_f(\psi) = \int_X f(\lambda)\,d\mu_\psi(\lambda)
\]

is a quadratic form on $W_f$.

2. If $L_f$ is the associated sesquilinear form on $W_f$, we have

\[
|L_f(\phi, \psi)| \le \|\phi\|\,\|f\|_{L^2(X, \mu_\psi)} \tag{10.3}
\]

for all $\phi, \psi \in W_f$.

3. For each $\psi \in W_f$, there is a unique $\chi \in \mathbf{H}$ such that $L_f(\phi, \psi) = \langle \phi, \chi \rangle$
for all $\phi \in W_f$. Furthermore, the map $\psi \mapsto \chi$ is linear and for all
$\psi \in W_f$, we have

\[
\|\chi\|^2 = \int_X |f|^2\,d\mu_\psi. \tag{10.4}
\]

Proof. It is easy to see that $W_f$ is closed under scalar multiplication. To
show that it is closed under addition, note that since $\mu(E)$ is self-adjoint
and satisfies $\mu(E)^2 = \mu(E)$, we have

\[
\begin{aligned}
\mu_{\phi+\psi}(E) &= \|\mu(E)(\phi + \psi)\|^2 \\
&\le \bigl( \|\mu(E)\phi\| + \|\mu(E)\psi\| \bigr)^2 \\
&\le 2\|\mu(E)\phi\|^2 + 2\|\mu(E)\psi\|^2 \\
&= 2\mu_\phi(E) + 2\mu_\psi(E),
\end{aligned}
\]

where in the third line we have used the elementary inequality $(x + y)^2 \le 2x^2 + 2y^2$.

To show that $W_f$ is dense in $\mathbf{H}$, let $E_n = \{ x \in X \mid |f(x)| < n \}$. If
$\psi \in \mathrm{Range}(\mu(E_n))$, then $\mu_\psi(E_n^c) = 0$, and, thus,

\[
\int_X |f|^2\,d\mu_\psi = \int_{E_n} |f|^2\,d\mu_\psi \le n^2 \mu_\psi(E_n) < \infty, \tag{10.5}
\]

showing that $\psi$ belongs to $W_f$. Since also $\bigcup_n E_n = X$, the union of the
ranges of the $\mu(E_n)$’s is dense and contained in $W_f$.

If $f$ is bounded, $Q_f$ may be computed as

\[
Q_f(\psi) = \left\langle \psi, \left( \int_X f\,d\mu \right)\psi \right\rangle, \quad \psi \in \mathbf{H},
\]



where $\int_X f\,d\mu$ is as in Chap. 7. Thus, $Q_f$ is a quadratic form for which the
associated sesquilinear form is

\[
L_f(\phi, \psi) = \left\langle \phi, \left( \int_X f\,d\mu \right)\psi \right\rangle, \quad \phi, \psi \in \mathbf{H}.
\]

This form satisfies

\[
\begin{aligned}
|L_f(\phi, \psi)| &\le \|\phi\| \left\| \left( \int_X f\,d\mu \right)\psi \right\| \\
&= \|\phi\|\,\|f\|_{L^2(X, \mu_\psi)}
\end{aligned} \tag{10.6}
\]

for all $\phi, \psi \in \mathbf{H}$, where in the second line we have used (10.1).

If $f$ is unbounded and $\psi$ belongs to $W_f$, let $f_n = f 1_{E_n}$. Then $Q_f(\psi) =
\lim_{n \to \infty} Q_{f_n}(\psi)$, by monotone convergence, in which case, it is easy to
see that $Q_f$ is still a quadratic form and that (10.6) still holds for all
$\phi \in \mathbf{H}$. From (10.6), we see that for each $\psi \in W_f$, the conjugate-linear
functional $\phi \mapsto L_f(\phi, \psi)$ is bounded. Thus, by (the complex-conjugate
of) the Riesz theorem, there is a unique vector $\chi$ such that $L_f(\phi, \psi) =
\langle \phi, \chi \rangle$. Furthermore, (10.6) tells us that $\|\chi\| \le \|f\|_{L^2(X, \mu_\psi)}$. Conversely,
since $L_f(\phi, \psi) = \langle \phi, \chi \rangle$, (10.6) is an equality when $\phi = \chi$, showing that
$\|\chi\| \ge \|f\|_{L^2(X, \mu_\psi)}$. Finally, the map $\psi \mapsto \chi$ is linear because $L_f(\phi, \psi)$ is
linear in $\psi$. ■
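
To make the domain condition (10.2) concrete, here is a small worked example. It is only a sketch, and the setting is an assumption made for illustration rather than something taken from the text: $X = \mathbb{N}$ with every subset measurable, $\mathbf{H} = \ell^2(\mathbb{N})$, and $\mu(E)$ the orthogonal projection onto the closed span of the standard basis vectors $e_n$ with $n \in E$.

```latex
% Assumed setting: X = N, H = l^2(N), mu(E) = projection onto span{e_n : n in E}.
\[
\mu_\psi(E) = \sum_{n \in E} |\psi_n|^2 ,
\qquad
W_f = \Bigl\{ \psi \in \ell^2(\mathbb{N}) \Bigm| \sum_{n=1}^{\infty} |f(n)|^2\,|\psi_n|^2 < \infty \Bigr\},
\qquad
\Bigl( \Bigl( \int_X f\,d\mu \Bigr) \psi \Bigr)_n = f(n)\,\psi_n .
\]
% With the unbounded function f(n) = n:
\[
\psi_n = \frac{1}{n}: \ \sum_{n} n^2 \cdot \frac{1}{n^2} = \infty \ \Rightarrow\ \psi \notin W_f ;
\qquad
\psi_n = \frac{1}{n^2}: \ \sum_{n} n^2 \cdot \frac{1}{n^4} = \sum_{n} \frac{1}{n^2} < \infty \ \Rightarrow\ \psi \in W_f .
\]
```

In this example $W_f$ contains every finitely supported sequence, so it is dense, but it is a proper subspace of $\ell^2(\mathbb{N})$.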

Proposition 10.3 If $f$ is a real-valued, measurable function on $X$, then
$\int_X f\,d\mu$ is self-adjoint on $W_f$.

Proof. Let $A_f = \int_X f\,d\mu$. Define subsets $F_n$ of $X$ by

\[
F_n = \{ x \in X \mid n - 1 \le |f(x)| < n \},
\]

so that $X$ is the disjoint union of the $F_n$’s, and let $W_n = \mathrm{Range}(\mu(F_n))$. As
in the proof of Proposition 10.2, any $\psi \in W_n$ is in $W_f$, and the quadratic
form $Q_f$ is bounded on $W_n$ [compare (10.5)]. Furthermore, if $\phi \in (W_n)^\perp$
and $\psi \in W_n$, it is straightforward to check that $\mu_{\phi+\psi} = \mu_\phi + \mu_\psi$ and so

\[
Q_f(\phi + \psi) = Q_f(\phi) + Q_f(\psi). \tag{10.7}
\]

From (10.7), we obtain, by the polarization identity,

\[
\langle \phi, A_f \psi \rangle = L_f(\phi, \psi) = 0.
\]

This shows that $A_f \psi$ belongs to $(W_n)^{\perp\perp} = W_n$.

We conclude that $A_f$ maps $W_n$ boundedly to itself. Indeed, the restriction
to $W_n$ of $A_f$ coincides with the restriction to $W_n$ of the bounded
operator obtained by integrating $f 1_{F_n}$ with respect to $\mu$ (compare the
quadratic forms). Furthermore, since $Q_f$ is real-valued, the restriction of
$A_f$ to $W_n$ is self-adjoint (Proposition A.63).

Now, $\mathbf{H}$ is the orthogonal direct sum of the $W_n$’s, meaning that $\mathbf{H}$ may be
identified with the set of infinite sequences $(\psi_1, \psi_2, \psi_3, \ldots)$ with $\psi_n \in W_n$
and such that

\[
\sum_{n=1}^{\infty} \|\psi_n\|^2 < \infty.
\]

If $A_n$ denotes the restriction of $A_f$ to $W_n$, then under this decomposition
of $\mathbf{H}$, we have

\[
W_f = \left\{ \psi = (\psi_1, \psi_2, \ldots) \in \mathbf{H} \;\middle|\; \sum_{n=1}^{\infty} \|A_n \psi_n\|^2 < \infty \right\}. \tag{10.8}
\]

To verify (10.8), we note that

\[
\int_X |f|^2\,d\mu_\psi = \sum_{n=1}^{\infty} \int_{F_n} |f|^2\,d\mu_\psi = \sum_{n=1}^{\infty} \|A_n \psi_n\|^2. \tag{10.9}
\]

The first equality is by monotone convergence and the second holds because
$A_f$ coincides with $\int_X f 1_{F_n}\,d\mu$ on $W_n$. In particular, the first quantity in (10.9) is finite if and
only if the last quantity is finite.

By a similar argument, for $\psi \in W_f$, we have

\[
Q_f(\psi) = \int_X f(\lambda)\,d\mu_\psi(\lambda) = \sum_{n=1}^{\infty} \langle \psi_n, A_n \psi_n \rangle,
\]

from which it follows that

\[
L_f(\phi, \psi) = \sum_{n=1}^{\infty} \langle \phi_n, A_n \psi_n \rangle
\]

for all $\phi, \psi \in W_f$. From this we see that $A_f \psi$ is the vector represented by
the sequence $(A_1 \psi_1, A_2 \psi_2, \ldots)$. It then follows from Example 9.26 that $A_f$
is self-adjoint. ■


Theorem 10.4 (Spectral Theorem, First Form) Suppose $A$ is a
self-adjoint operator on $\mathbf{H}$. Then there is a unique projection-valued measure
$\mu^A$ on $\sigma(A)$ with values in $\mathcal{B}(\mathbf{H})$ such that

\[
\int_{\sigma(A)} \lambda\,d\mu^A(\lambda) = A. \tag{10.10}
\]

Since the spectrum of $A$ is typically an unbounded set, the function
$f(\lambda) = \lambda$ is an unbounded function on $\sigma(A)$. Note also that the equality
in (10.10) includes, as always, equality of domains. That is, the domain of
the integral on the left-hand side, namely the space $W_f$ in Proposition 10.1,
coincides with $\mathrm{Dom}(A)$. The proof of this theorem is given in Sect. 10.4.

Definition 10.5 (Functional Calculus) For any measurable function $f$
on $\sigma(A)$, define a (possibly unbounded) operator, denoted $f(A)$, by

\[
f(A) = \int_{\sigma(A)} f(\lambda)\,d\mu^A(\lambda).
\]

As usual, we can extend the projection-valued measure $\mu^A$ from $\sigma(A)$ to
$\mathbb{R}$ by setting $\mu^A$ equal to zero on the complement of $\sigma(A)$.

Definition 10.6 (Spectral Subspaces) If $A$ is a self-adjoint operator
on $\mathbf{H}$, then for any Borel set $E \subset \mathbb{R}$, define the spectral subspace $V_E$
of $\mathbf{H}$ by

\[
V_E = \mathrm{Range}(\mu^A(E)).
\]

Definition 10.7 (Measurement Probabilities) If $A$ is a self-adjoint
operator on $\mathbf{H}$, then for any unit vector $\psi \in \mathbf{H}$, define a probability measure
$\mu^A_\psi$ on $\mathbb{R}$ by the formula

\[
\mu^A_\psi(E) = \langle \psi, \mu^A(E)\psi \rangle.
\]

If the operator $A$ represents some observable in quantum mechanics,
then we interpret $\mu^A_\psi$ to be the probability distribution for the result of
measuring $A$ in the state $\psi$.
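
In finite dimensions the measure in Definition 10.7 is a sum of point masses at the eigenvalues, with weights $\langle \psi, P\psi \rangle$ for the corresponding eigenprojections $P$. The following is a minimal sketch of that computation. The observable (the 2x2 matrix with rows (0, 1) and (1, 0)), its hand-computed eigenvectors, and all function names are assumptions chosen for illustration; none of them come from the text.

```python
import math

def inner(u, v):
    """Inner product, conjugate-linear in the first slot (the convention used in the text)."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def project(e, psi):
    """Apply the rank-one projection onto the unit vector e to psi."""
    c = inner(e, psi)
    return [c * x for x in e]

# Hypothetical observable A = [[0, 1], [1, 0]]; its eigenvalues are +1 and -1,
# with normalized eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2).
s = 1 / math.sqrt(2)
eigenpairs = {
    +1.0: [complex(s), complex(s)],
    -1.0: [complex(s), complex(-s)],
}

def measurement_probabilities(psi):
    """Return {eigenvalue: <psi, P psi>} for a unit vector psi."""
    return {lam: inner(psi, project(e, psi)).real for lam, e in eigenpairs.items()}

psi = [complex(1), complex(0)]
probs = measurement_probabilities(psi)
print(probs)
```

The weights are non-negative and sum to $\|\psi\|^2 = 1$, which is what makes $\mu^A_\psi$ a probability measure.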

Proposition 10.8 Let $A$ be a self-adjoint operator on $\mathbf{H}$. Then the spectral
subspaces $V_E$ associated to $A$ have the following properties.

1. If $E$ is a bounded subset of $\mathbb{R}$, then $V_E \subset \mathrm{Dom}(A)$, $V_E$ is invariant
under $A$, and the restriction of $A$ to $V_E$ is bounded.

2. If $E$ is contained in $(\lambda_0 - \varepsilon, \lambda_0 + \varepsilon)$, then for all $\psi \in V_E$, we have

\[
\|(A - \lambda_0 I)\psi\| \le \varepsilon \|\psi\|.
\]

Proof. Point 1 holds because the function $f(\lambda) = \lambda$ is bounded on $E$. (See
the proof of Proposition 10.3.) Point 2 then holds because, as in the proof
of Proposition 10.3, the restriction of $A$ to $V_E$ coincides with the restriction
to $V_E$ of the operator $f(A)$, where $f(\lambda) = \lambda 1_E(\lambda)$. ■

Theorem 10.9 (Spectral Theorem, Second Form) Suppose $A$ is a
self-adjoint operator on $\mathbf{H}$. Then there is a $\sigma$-finite measure $\mu$ on $\sigma(A)$,
a direct integral

\[
\int_{\sigma(A)}^{\oplus} \mathbf{H}_\lambda\,d\mu(\lambda),
\]

and a unitary map $U$ from $\mathbf{H}$ to the direct integral such that

\[
U(\mathrm{Dom}(A)) = \left\{ s \in \int_{\sigma(A)}^{\oplus} \mathbf{H}_\lambda\,d\mu(\lambda) \;\middle|\; \int_{\sigma(A)} \|\lambda s(\lambda)\|_\lambda^2\,d\mu(\lambda) < \infty \right\}
\]

and such that

\[
\bigl( U A U^{-1}(s) \bigr)(\lambda) = \lambda s(\lambda)
\]

for all $s \in U(\mathrm{Dom}(A))$.

Theorem 10.10 (Spectral Theorem, Multiplication Operator Form)
Suppose $A$ is a self-adjoint operator on $\mathbf{H}$. Then there is a $\sigma$-finite measure
space $(X, \mu)$, a measurable, real-valued function $h$ on $X$, and a unitary map
$U : \mathbf{H} \to L^2(X, \mu)$ such that

\[
U(\mathrm{Dom}(A)) = \{ \psi \in L^2(X, \mu) \mid h\psi \in L^2(X, \mu) \}
\]

and such that

\[
\bigl( U A U^{-1}(\psi) \bigr)(x) = h(x)\psi(x)
\]

for all $\psi \in U(\mathrm{Dom}(A))$.

These theorems are also proved in Sect. 10.4.


10.2 Stone’s Theorem and One-Parameter Unitary 
Groups 

In this section we explore the notion of one-parameter unitary groups and 
their connection to self-adjoint operators. We assume here the spectral 
theorem, the proof of which (in Sect. 10.4) does not use any results from 
this section. 

Definition 10.11 A one-parameter unitary group on $\mathbf{H}$ is a family
$U(t)$, $t \in \mathbb{R}$, of unitary operators with the property that $U(0) = I$ and that
$U(s + t) = U(s)U(t)$ for all $s, t \in \mathbb{R}$. A one-parameter unitary group is said
to be strongly continuous if

\[
\lim_{s \to t} \|U(t)\psi - U(s)\psi\| = 0 \tag{10.11}
\]

for all $\psi \in \mathbf{H}$ and all $t \in \mathbb{R}$.

Almost all one-parameter unitary groups arising in applications are
strongly continuous.

Example 10.12 Let $\mathbf{H} = L^2(\mathbb{R}^n)$ and let $U_a(t)$ be the translation operator
given by

\[
(U_a(t)\psi)(x) = \psi(x + ta). \tag{10.12}
\]

Then $U_a(\cdot)$ is a strongly continuous one-parameter unitary group.

Proof. It is easy to see that $U_a(\cdot)$ is a one-parameter unitary group. To see
that $U_a(\cdot)$ is strongly continuous, consider first the case in which $\psi$ is
continuous and compactly supported. Since a continuous function on a
compact metric space is automatically uniformly continuous, it follows that
$\psi(x + ta)$ tends uniformly to $\psi(x)$ as $t$ tends to zero. Since also the support
of $\psi$ is compact and thus of finite measure, it follows that $\psi(x + ta)$ tends
to $\psi(x)$ in $L^2(\mathbb{R}^n)$ as $t$ tends to zero.

Now, the space $C_c(\mathbb{R}^n)$ of continuous functions of compact support is
dense in $L^2(\mathbb{R}^n)$ (Theorem A.10). Thus, given $\varepsilon > 0$ and $\psi \in L^2(\mathbb{R}^n)$, we
can find $\phi \in C_c(\mathbb{R}^n)$ such that $\|\psi - \phi\|_{L^2(\mathbb{R}^n)} < \varepsilon/3$. Then choose $\delta$ so that
$\|U_a(\alpha)\phi - \phi\| < \varepsilon/3$ whenever $|\alpha| < \delta$. Then given $t \in \mathbb{R}$, if $|t - s| < \delta$, we
have

\[
\begin{aligned}
\|U_a(t)\psi - U_a(s)\psi\|
&\le \|U_a(t)\psi - U_a(t)\phi\| + \|U_a(t)\phi - U_a(s)\phi\| + \|U_a(s)\phi - U_a(s)\psi\| \\
&= \|U_a(t)(\psi - \phi)\| + \|U_a(s)(U_a(t - s)\phi - \phi)\| + \|U_a(s)(\phi - \psi)\|.
\end{aligned} \tag{10.13}
\]

Since $U_a(t)$ and $U_a(s)$ are unitary, we can see that each of the terms on the
last line of (10.13) is less than $\varepsilon/3$. ■

Note that for $a \ne 0$ the unitary group $U_a(\cdot)$ in Example 10.12 is not
continuous in the operator norm topology. After all, given any $\varepsilon \ne 0$, we
can take a nonzero element $\psi$ of $L^2(\mathbb{R}^n)$ that is supported in a very small
ball around the origin. Then $U_a(\varepsilon)\psi$ is orthogonal to $\psi$ and has the same
norm as $\psi$, so that

\[
\|U_a(\varepsilon)\psi - U_a(0)\psi\| = \|U_a(\varepsilon)\psi - \psi\| = \sqrt{2}\,\|\psi\|.
\]

Thus, $\|U_a(\varepsilon) - U_a(0)\| \ge \sqrt{2}$ for all $\varepsilon \ne 0$.
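
The factor $\sqrt{2}$ can be checked by expanding the norm. The short computation below is a sketch under one explicit assumption that the text leaves implicit: the ball supporting $\psi$ has radius $r$ with $2r < |\varepsilon|\,|a|$, so that $\psi$ and its translate have disjoint supports.

```latex
% Assumption: psi is supported in a ball of radius r around the origin with 2r < |eps| |a|,
% so psi and U_a(eps) psi have disjoint supports and are therefore orthogonal.
\[
\|U_a(\varepsilon)\psi - \psi\|^2
= \|U_a(\varepsilon)\psi\|^2 - 2\,\mathrm{Re}\,\langle U_a(\varepsilon)\psi, \psi \rangle + \|\psi\|^2
= \|\psi\|^2 - 0 + \|\psi\|^2
= 2\,\|\psi\|^2 .
\]
```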


Definition 10.13 If $U(\cdot)$ is a strongly continuous one-parameter unitary
group, the infinitesimal generator of $U(\cdot)$ is the operator $A$ given by

\[
A\psi = \lim_{t \to 0} \frac{1}{i} \frac{U(t)\psi - \psi}{t}, \tag{10.14}
\]

with $\mathrm{Dom}(A)$ consisting of the set of $\psi \in \mathbf{H}$ for which the limit in (10.14)
exists in the norm topology on $\mathbf{H}$.

The following result shows that we can construct a strongly continuous
one-parameter unitary group from any self-adjoint operator $A$ by setting
$U(t) = e^{iAt}$. Furthermore, the original operator $A$ is precisely the infinitesimal
generator of $U(\cdot)$.

Proposition 10.14 Suppose $A$ is a self-adjoint operator on $\mathbf{H}$ and let $U(\cdot)$
be defined by

\[
U(t) = e^{itA},
\]

where the operator $e^{itA}$ is defined by the functional calculus for $A$. Then
the following hold.

1. $U(\cdot)$ is a strongly continuous one-parameter unitary group.

2. For all $\psi \in \mathrm{Dom}(A)$, we have

\[
A\psi = \lim_{t \to 0} \frac{1}{i} \frac{U(t)\psi - \psi}{t},
\]

where the limit is in the norm topology on $\mathbf{H}$.

3. For all $\psi \in \mathbf{H}$, if the limit

\[
\lim_{t \to 0} \frac{1}{i} \frac{U(t)\psi - \psi}{t}
\]

exists in the norm topology on $\mathbf{H}$, then $\psi \in \mathrm{Dom}(A)$ and the limit is
equal to $A\psi$.

Proof. Since $\sigma(A) \subset \mathbb{R}$, the function $f(\lambda) := e^{it\lambda}$ is bounded on $\sigma(A)$ and
satisfies $f(\lambda)\overline{f(\lambda)} = 1$ for all $\lambda \in \sigma(A)$. Thus, the operator $f(A)$ is bounded
and satisfies

\[
f(A)f(A)^* = f(A)^* f(A) = I,
\]

which shows that $f(A) = e^{itA}$ is unitary. The multiplicativity of the functional
calculus then tells us that $U(\cdot)$ is a one-parameter unitary group. To
see that $U(\cdot)$ is strongly continuous, note that

\[
\begin{aligned}
\|U(t)\psi - U(s)\psi\|^2
&= \langle \psi, (U(t)^* - U(s)^*)(U(t) - U(s))\psi \rangle \\
&= \int_{-\infty}^{\infty} \left| e^{it\lambda} - e^{is\lambda} \right|^2 d\mu^A_\psi(\lambda).
\end{aligned} \tag{10.15}
\]

The integral on the right-hand side of (10.15) tends to zero as $s$ approaches
$t$, by dominated convergence.

For Point 2, recall from Theorem 10.4 that $A = \int_{-\infty}^{\infty} \lambda\,d\mu^A(\lambda)$, and
take $\psi \in \mathrm{Dom}(A)$. Then, by (10.4), we have

\[
\left\| \frac{1}{i} \frac{U(t)\psi - \psi}{t} - A\psi \right\|^2
= \int_{-\infty}^{\infty} \left| \frac{1}{i} \frac{e^{it\lambda} - 1}{t} - \lambda \right|^2 d\mu^A_\psi(\lambda). \tag{10.16}
\]

If we write the function $e^{it\lambda} - 1$ as the integral of its derivative with respect
to $\lambda$, starting at $\lambda = 0$, we can see that $|(e^{it\lambda} - 1)/t| \le |\lambda|$. Meanwhile,
since $\psi$ is in the domain of the operator $A = \int_{-\infty}^{\infty} \lambda\,d\mu^A(\lambda)$, we have
$\int_{-\infty}^{\infty} \lambda^2\,d\mu^A_\psi(\lambda) < \infty$. Thus, we may apply dominated convergence, with
$4\lambda^2$ as our dominating function, to show that the right-hand side of (10.16)
tends to zero as $t$ tends to zero.

For Point 3, let $B$ be the infinitesimal generator of $U(\cdot)$. If $\phi$ and $\psi$ belong
to $\mathrm{Dom}(B)$, then

\[
\begin{aligned}
\langle \phi, B\psi \rangle
&= \lim_{t \to 0} \left\langle \phi, \frac{1}{i} \frac{U(t)\psi - \psi}{t} \right\rangle \\
&= \lim_{t \to 0} \left\langle -\frac{1}{i} \frac{U(-t)\phi - \phi}{t}, \psi \right\rangle \\
&= \lim_{t \to 0} \left\langle \frac{1}{i} \frac{U(-t)\phi - \phi}{(-t)}, \psi \right\rangle \\
&= \langle B\phi, \psi \rangle.
\end{aligned}
\]

Thus, $B$ is symmetric. On the other hand, Point 2 shows that $B$ is an
extension of $A$, so by Exercise 7 in Chap. 9, $B = A$ (with equality of
domain). ■
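
For a bounded self-adjoint operator the three points of Proposition 10.14 can be checked numerically. The sketch below does this for one hypothetical 2x2 matrix, chosen so that its square is the identity and $e^{itA} = \cos(t) I + i \sin(t) A$ in closed form. The matrix, the helper functions, and the tolerances are illustrative assumptions; the unbounded case, which is the real content of the proposition, is not captured by any finite matrix.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def adjoint(X):
    """Conjugate transpose of a 2x2 matrix."""
    return [[X[j][i].conjugate() for j in range(2)] for i in range(2)]

def max_abs_diff(X, Y):
    """Largest entrywise distance between two 2x2 matrices."""
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

IDENTITY = [[1 + 0j, 0j], [0j, 1 + 0j]]
A = [[0j, 1 + 0j], [1 + 0j, 0j]]  # hypothetical self-adjoint matrix with A*A = I

def U(t):
    """e^{itA} = cos(t) I + i sin(t) A, valid here because A squared is the identity."""
    c, s = math.cos(t), math.sin(t)
    return [[c * IDENTITY[i][j] + 1j * s * A[i][j] for j in range(2)] for i in range(2)]

# Point 1: U(t) is unitary and satisfies the group law U(s)U(t) = U(s + t).
unitarity_error = max_abs_diff(matmul(U(0.7), adjoint(U(0.7))), IDENTITY)
group_error = max_abs_diff(matmul(U(0.3), U(0.4)), U(0.7))

def generator_error(t):
    """Distance between the difference quotient (U(t) - I)/(i t) and A."""
    Ut = U(t)
    quotient = [[(Ut[i][j] - IDENTITY[i][j]) / (1j * t) for j in range(2)] for i in range(2)]
    return max_abs_diff(quotient, A)

print(unitarity_error, group_error, generator_error(1e-2), generator_error(1e-4))
```

The difference quotient approaches $A$ at a rate proportional to $t$, in line with Point 2.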


Theorem 10.15 (Stone’s Theorem) Suppose $U(\cdot)$ is a strongly continuous
one-parameter unitary group on $\mathbf{H}$. Then the infinitesimal generator
$A$ of $U(\cdot)$ is densely defined and self-adjoint, and $U(t) = e^{itA}$ for all $t \in \mathbb{R}$.


If $U(\cdot)$ is a strongly continuous one-parameter unitary group, then $U(\cdot)$
is continuous in the operator norm topology if and only if the infinitesimal
generator of $U(\cdot)$ is a bounded operator (Exercise 1). As Example 10.12
suggests, most one-parameter unitary groups that arise in applications are
not continuous in the operator norm topology.

Before giving the proof of Stone’s theorem, let us work out the generator
of the group in Example 10.12.


Example 10.16 If $U_a(\cdot)$, $a \in \mathbb{R}^n$, is the strongly continuous one-parameter
unitary group in Example 10.12, then each $\psi \in C_c^\infty(\mathbb{R}^n)$ is in
the domain of the infinitesimal generator $A$ of $U_a(\cdot)$ and for all such $\psi$, we
have

\[
A\psi = -i \sum_{j=1}^{n} a_j \frac{\partial \psi}{\partial x_j}. \tag{10.17}
\]

Furthermore, $A$ is essentially self-adjoint on $C_c^\infty(\mathbb{R}^n)$.

Proof. The formula for the infinitesimal generator is easy to establish for
$\psi$ in $C_c^\infty(\mathbb{R}^n)$. The essential self-adjointness of $A$ is a special case of Proposition
13.5 (the proof of which is similar to the proof of Proposition 9.29). ■
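
The formula (10.17) can be sanity-checked pointwise in one dimension by comparing the difference quotient in (10.14) with $-i a \psi'(x)$. The sketch below does this for a Gaussian test function. The Gaussian is an assumption made for convenience (it is smooth and rapidly decaying but not compactly supported), and the sample point, shift, and step size are arbitrary illustrative values.

```python
import math

def psi(x):
    """Hypothetical test function: a Gaussian (smooth, but not compactly supported)."""
    return math.exp(-x * x)

def dpsi(x):
    """Exact derivative of the Gaussian above."""
    return -2.0 * x * math.exp(-x * x)

def difference_quotient(x, a, t):
    """(1/i) * ((U_a(t) psi)(x) - psi(x)) / t, with (U_a(t) psi)(x) = psi(x + t a)."""
    return (psi(x + t * a) - psi(x)) / (1j * t)

def generator_formula(x, a):
    """-i * a * psi'(x), the one-dimensional case of the formula for A psi."""
    return -1j * a * dpsi(x)

x0, a0 = 0.5, 2.0
error = abs(difference_quotient(x0, a0, 1e-6) - generator_formula(x0, a0))
print(error)
```

This checks the limit only at a single point; the definition of the generator requires convergence in the $L^2$ norm, which is a stronger statement.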


We now establish two intermediate results before coming to the proof of 
Stone’s theorem. 

Lemma 10.17 Let $U(\cdot)$ be a strongly continuous one-parameter unitary
group and let $A$ be its infinitesimal generator. If $\psi \in \mathrm{Dom}(A)$, then for all
$t \in \mathbb{R}$, the vector $U(t)\psi$ belongs to $\mathrm{Dom}(A)$ and

\[
\lim_{h \to 0} \frac{U(t + h)\psi - U(t)\psi}{h} = iU(t)A\psi = iAU(t)\psi. \tag{10.18}
\]

Note that Lemma 10.17 tells us that the curve $\psi(t) := U(t)\psi_0$ in $\mathbf{H}$
satisfies the differential equation

\[
\frac{d\psi}{dt} = iA\psi(t)
\]

in the natural Hilbert space sense, provided that $\psi_0$ belongs to $\mathrm{Dom}(A)$.
This result, together with Proposition 10.14, tells us that if $\psi_0 \in \mathrm{Dom}(\hat{H})$,
then the curve $\psi(t) := e^{-it\hat{H}/\hbar}\psi_0$ indeed solves the Schrödinger equation
in the Hilbert space sense.

Proof. We compute that

\[
\frac{U(t + h)\psi - U(t)\psi}{h} = U(t)\left[ \frac{U(h)\psi - \psi}{h} \right]. \tag{10.19}
\]

Since $\psi \in \mathrm{Dom}(A)$, the limit as $h$ tends to zero of (10.19) exists and is
equal to $iU(t)A\psi$. On the other hand,

\[
\frac{U(t + h)\psi - U(t)\psi}{h} = \frac{U(h)(U(t)\psi) - (U(t)\psi)}{h}.
\]

Thus, the limit as $h$ tends to zero of (10.19) is, by the definition of $A$, equal
to $iA(U(t)\psi)$. This shows that $U(t)\psi$ is in the domain of $A$ and establishes
the second equality in (10.18). ■

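Lemma 10.18 For any strongly continuous one-parameter unitary group
$U(\cdot)$, the infinitesimal generator $A$ is densely defined.

Proof. Given any smooth function $f$ of compact support, define an
operator $B_f$ by setting

\[
B_f = \int_{-\infty}^{\infty} f(\tau)U(\tau)\,d\tau.
\]

Here, the operator-valued integral is the unique bounded operator such
that

\[
\langle \phi, B_f \psi \rangle = \int_{-\infty}^{\infty} f(\tau) \langle \phi, U(\tau)\psi \rangle\,d\tau \tag{10.20}
\]

for all $\phi, \psi \in \mathbf{H}$. [It is easy to see that the right-hand side of (10.20) defines a bounded
sesquilinear form, for each fixed $f \in C_c^\infty(\mathbb{R})$.]

Using the group property of $U(\cdot)$, we see that

\[
\begin{aligned}
\frac{U(t)B_f \psi - B_f \psi}{t}
&= \frac{1}{t} \int_{-\infty}^{\infty} f(\tau)\,[U(t + \tau) - U(\tau)]\psi\,d\tau \\
&= \int_{-\infty}^{\infty} \frac{f(\tau - t) - f(\tau)}{t}\,U(\tau)\psi\,d\tau,
\end{aligned}
\]

where in the second line, we have made a change of variable in the first
term in the integral. From this, we easily obtain that

\[
\lim_{t \to 0} \frac{U(t)B_f \psi - B_f \psi}{t} = -\int_{-\infty}^{\infty} f'(\tau)U(\tau)\psi\,d\tau = -B_{f'}\psi.
\]

This shows that $B_f \psi$ is in the domain of $A$ for all $\psi \in \mathbf{H}$ and $f \in C_c^\infty(\mathbb{R})$.

Now choose a sequence $f_n \in C_c^\infty(\mathbb{R})$ such that $f_n$ is non-negative and
supported in the interval $[-1/n, 1/n]$ and such that $\int_{-\infty}^{\infty} f_n(\tau)\,d\tau = 1$.
Then for any $\psi \in \mathbf{H}$, we have

\[
B_{f_n}\psi - \psi = \int_{-\infty}^{\infty} f_n(\tau)\,[U(\tau)\psi - \psi]\,d\tau,
\]

so that

\[
\begin{aligned}
\|B_{f_n}\psi - \psi\|
&\le \int_{-\infty}^{\infty} f_n(\tau)\,\|U(\tau)\psi - \psi\|\,d\tau \\
&\le \sup_{-1/n \le \tau \le 1/n} \|U(\tau)\psi - \psi\|.
\end{aligned}
\]

Since $U(\cdot)$ is strongly continuous, we see that $B_{f_n}\psi$ converges to $\psi$ as
$n \to \infty$. Thus, every element of $\mathbf{H}$ can be approximated by vectors in the
domain of $A$. ■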

Proof of Theorem 10.15. Suppose $U(\cdot)$ is a strongly continuous one-parameter
unitary group and $A$ is its infinitesimal generator. By Lemma
10.18, $A$ is densely defined. As shown in the proof of Proposition 10.14, $A$
(denoted by $B$ in that proof) is symmetric.

Next, we show that $A$ is essentially self-adjoint. Suppose that $\psi$
belongs to the kernel of $A^* - iI$, i.e., $A^*\psi = i\psi$. Given $\phi \in \mathrm{Dom}(A)$,
set $y(t) = \langle U(t)\phi, \psi \rangle$, so that $|y(t)| \le \|\phi\|\,\|\psi\|$. On the other hand, we
expect that $U(t) = e^{iAt}$, so that $U(t)^*$ should be $e^{-iA^*t}$. Thus, $y(t)$ should
(formally) be equal to $\langle \phi, e^{t}\psi \rangle$. If this is correct, then since $y(t)$ is a bounded
function of $t$, we must have $\langle \phi, \psi \rangle = 0$. Thus, $\psi$ would be orthogonal to
every element of a dense subspace of $\mathbf{H}$, showing that $\psi = 0$. We could
then similarly argue that $\ker(A^* + iI) = \{0\}$, which would show that $A$ is
essentially self-adjoint.

To make the argument rigorous, we apply Lemma 10.17, giving

\[
\begin{aligned}
\frac{d}{dt} \langle U(t)\phi, \psi \rangle
&= \langle iAU(t)\phi, \psi \rangle = \langle iU(t)\phi, A^*\psi \rangle \\
&= \langle iU(t)\phi, i\psi \rangle = \langle U(t)\phi, \psi \rangle.
\end{aligned}
\]

Thus, the function $y(t) := \langle U(t)\phi, \psi \rangle$ satisfies the ordinary differential
equation $dy/dt = y$. The unique solution to this equation is $y(t) = y(0)e^{t}$.
Since $y$ is bounded, we must have $0 = y(0) = \langle \phi, \psi \rangle$ for all $\phi \in \mathrm{Dom}(A)$,
which implies that $\psi = 0$. Thus, $\ker(A^* - iI) = \{0\}$, and by a similar
argument $\ker(A^* + iI) = \{0\}$. This shows (Corollary 9.22) that $A$ is essentially
self-adjoint.

We can now construct a strongly continuous unitary group $V(\cdot)$ by setting
$V(t) = e^{iA^{cl}t}$. To show that $V(\cdot) = U(\cdot)$, take $\psi \in \mathrm{Dom}(A) \subset
\mathrm{Dom}(A^{cl})$ and set $w(t) = U(t)\psi - V(t)\psi$. By Proposition 10.14, the
infinitesimal generator of $V(\cdot)$ is $A^{cl}$. Thus, applying Lemma 10.17 to both
$U(\cdot)$ and $V(\cdot)$, and using that $A^{cl}$ extends $A$, we have

\[
\begin{aligned}
\frac{d}{dt} w(t) &= iAU(t)\psi - iA^{cl}V(t)\psi \\
&= iA^{cl}w(t),
\end{aligned}
\]

where the limit defining $dw/dt$ is taken in the norm topology on $\mathbf{H}$. Thus,

\[
\begin{aligned}
\frac{d}{dt} \|w(t)\|^2 &= \langle iA^{cl}w(t), w(t) \rangle + \langle w(t), iA^{cl}w(t) \rangle \\
&= -i \langle A^{cl}w(t), w(t) \rangle + i \langle w(t), A^{cl}w(t) \rangle \\
&= 0,
\end{aligned}
\]

because $A^{cl}$ is symmetric. Since also $w(0) = 0$, we conclude that $w(t) = 0$
for all $t$. Thus, $U(\cdot)$ and $V(\cdot)$ agree on a dense subspace and hence on all
of $\mathbf{H}$.

We now know that $U(t) = e^{iA^{cl}t}$. It then follows from Points 2 and
3 of Proposition 10.14 that the infinitesimal generator of $U(\cdot)$ (namely
$A$) is precisely $A^{cl}$. That is, $A = A^{cl}$ and $U(t) = e^{iAt}$. Furthermore, we
have already shown that $A$ is essentially self-adjoint and we now know
that $A = A^{cl}$, so $A$ is actually self-adjoint. Finally, if $B$ is any self-adjoint
operator for which $U(t) = e^{iBt}$, then by Proposition 10.14, $B$ must be the
infinitesimal generator of $U(\cdot)$, i.e., $B = A$. ■


10.3 The Spectral Theorem for Bounded Normal 
Operators 

We are going to prove the spectral theorem for an unbounded self-adjoint 
operator by reducing it to the spectral theorem for a bounded operator. 
The reduction, however, will not be to a bounded self-adjoint operator, but 
rather to a unitary operator. Although we proved the spectral theorem only 
for bounded self-adjoint operators, the theorem applies more generally to 
bounded normal operators. (See Exercise 4 in Chap. 7 for the matrix case.) 

Definition 10.19 A bounded operator $A$ on $\mathbf{H}$ is normal if $A$ commutes
with its adjoint: $AA^* = A^*A$.

Every bounded self-adjoint operator is obviously normal. Other examples
of normal operators are skew-self-adjoint operators ($A^* = -A$) and unitary
operators ($UU^* = U^*U = I$). The spectrum of a bounded normal operator
need not be contained in $\mathbb{R}$, but can be an arbitrary closed, bounded,
nonempty subset of $\mathbb{C}$. On the other hand, if $U$ is unitary, then the spectrum
of $U$ is contained in the unit circle (Exercise 6 in Chap. 7).

In this section, we consider the spectral theorem for a bounded normal
operator $A$. The statements of the two versions of the theorem are precisely
the same as in the self-adjoint case, except that $\sigma(A)$ is no longer necessarily
contained in the real line. Almost all of the proofs of these results are the
same as in the self-adjoint case; we will, therefore, consider only those steps
where some modification in the argument is required.


Theorem 10.20 Suppose $A \in \mathcal{B}(\mathbf{H})$ is normal. Then there exists a unique
projection-valued measure $\mu^A$ on the Borel $\sigma$-algebra in $\sigma(A)$, with values
in $\mathcal{B}(\mathbf{H})$, such that

\[
\int_{\sigma(A)} \lambda\,d\mu^A(\lambda) = A.
\]

Furthermore, for any measurable set $E \subset \sigma(A)$, $\mathrm{Range}(\mu^A(E))$ is invariant
under $A$ and $A^*$.

Once we have the projection-valued measure $\mu^A$, we can define a functional
calculus for $A$, as in the self-adjoint case, by setting

\[
f(A) = \int_{\sigma(A)} f(\lambda)\,d\mu^A(\lambda)
\]

for any bounded measurable function $f$ on $\sigma(A)$.

We can also define spectral subspaces, as in the self-adjoint case, by setting

\[
V_E := \mathrm{Range}(\mu^A(E))
\]

for each Borel set $E \subset \sigma(A)$. These spectral subspaces have precisely the
same properties (with the same proofs) as in Proposition 7.15, with the
following two exceptions. First, the assertion that $V_E$ is invariant under $A$
should be replaced by the assertion that $V_E$ is invariant under $A$ and $A^*$.
Second, in Point 2 of the proposition, the condition $E \subset [\lambda_0 - \varepsilon, \lambda_0 + \varepsilon]$
should be replaced by $E \subset D(\lambda_0, \varepsilon)$, where $D(z, r)$ denotes the disk of
radius $r$ in $\mathbb{C}$ centered at $z$.

Meanwhile, the spectral theorem in its direct integral and multiplication
operator versions also holds for a bounded normal operator $A$. The
statements are identical to the self-adjoint case, except that we no longer
assume $\sigma(A) \subset \mathbb{R}$ and we no longer assume that the function $h$ in the
multiplication operator version is real valued.

Let us recall the two stages in the proof of the spectral theorem (first
version) for bounded self-adjoint operators. The first stage is the construction
of the continuous functional calculus. The steps in this construction are
(1) the equality of the norm and spectral radius for self-adjoint operators,
(2) the spectral mapping theorem, and (3) the Stone-Weierstrass theorem.
The second stage is a sort of operator-valued Riesz representation theorem,
which we prove by reducing it to the ordinary Riesz representation
theorem using quadratic forms. In generalizing from bounded self-adjoint
to bounded normal operators, the second stage of the proof is precisely the
same as in the self-adjoint case. In the first stage, however, there are some
additional ideas needed in each step of the argument.

There is a relatively simple argument that reduces the equality of norm
and spectral radius for normal operators to the self-adjoint case. Meanwhile,
since the spectral mapping theorem, as stated in Chap. 8, already
holds for arbitrary bounded operators, it appears that no change is needed
in this step. We must think, however, about the proper notion of “polynomial.”
For a general normal operator $A$, the spectrum of $A$ is not contained
in $\mathbb{R}$, and, thus, powers of $\lambda$ are complex-valued functions on $\sigma(A)$. We
must, therefore, use the complex-valued version of the Stone-Weierstrass
theorem (Appendix A.3.1), which requires that our algebra of functions be
closed under complex-conjugation. This means that we need to consider
polynomials in $\lambda$ and $\bar\lambda$, that is, linear combinations of functions of the
form $\lambda^m \bar\lambda^n$.

What we need, then, is a form of the spectral mapping theorem that
applies to this sort of polynomial. On the operator side, the natural counterpart
to the complex conjugate of a function is the adjoint of an operator.
Thus, applying the function $\lambda^m \bar\lambda^n$ to a normal operator $A$ should give
$A^m (A^*)^n$. The desired “spectral mapping theorem” is then the following:
If $p$ is a polynomial in two variables, and $A$ is a bounded normal operator,
then

\[
\sigma(p(A, A^*)) = \{ p(\lambda, \bar\lambda) \mid \lambda \in \sigma(A) \}. \tag{10.21}
\]

This statement is true (Theorem 10.23), but its proof is not nearly as
simple as the proof of the ordinary spectral mapping theorem. One way
to prove (10.21) is to use the theory of commutative $C^*$-algebras, as in
[33]. (See Theorem 11.19 in [33] along with the assertion on p. 321 that
the spectrum of an element is independent of the algebra containing that
element.) Another approach is the direct argument found in Bernau [3],
which uses no fancy machinery but which is long and not easily motivated.
A third approach is to use the spectral theorem for bounded self-adjoint
operators to help us prove (10.21); this is the approach we will follow.

We begin with the equality of norm and spectral radius and then turn
to (10.21).

Proposition 10.21 If $A \in \mathcal{B}(\mathbf{H})$ is normal, then

\[
\|A\| = R(A).
\]

Lemma 10.22 If $A$ and $B$ are commuting elements of $\mathcal{B}(\mathbf{H})$, then

\[
R(AB) \le R(A)R(B).
\]

Proof. If $A$ is any bounded operator, the proof of Lemma 8.1 shows that
for any real number $T$ with $T > R(A)$, we have

\[
\lim_{m \to \infty} \frac{\|A^m\|}{T^m} = 0.
\]

If $A$ and $B$ are two commuting bounded operators and $S$ and $T$ are two
real numbers, with $S > R(A)$ and $T > R(B)$, then

\[
\frac{\|(AB)^m\|}{S^m T^m} = \frac{\|A^m B^m\|}{S^m T^m} \le \frac{\|A^m\|}{S^m} \frac{\|B^m\|}{T^m}.
\]

Thus,

\[
\lim_{m \to \infty} \frac{\|(AB)^m\|}{S^m T^m} = 0. \tag{10.22}
\]

Meanwhile, if we apply the expression for the resolvent in the proof of
Lemma 8.1 to $AB$, we obtain

\[
(AB - \lambda)^{-1} = -\sum_{m=0}^{\infty} \frac{A^m B^m}{\lambda^{m+1}}, \tag{10.23}
\]

since $A$ and $B$ commute. For any $\lambda_1$ with $|\lambda_1| > R(A)R(B)$, take $\lambda_2$ with
$|\lambda_1| > |\lambda_2| > R(A)R(B)$. The terms in (10.23) with $\lambda = \lambda_2$ tend to zero
by (10.22), applied with $S > R(A)$ and $T > R(B)$ chosen so that $ST = |\lambda_2|$,
which means that (10.23) converges with $\lambda = \lambda_1$. Thus, $\lambda_1$ is
in the resolvent set of $AB$. ■

Proof of Proposition 10.21. For any bounded operator, $\|A\| \ge R(A)$
(Proposition 7.5). To get the inequality in the other direction, recall (Proposition
7.2) that $\|A\|^2 = \|A^*A\|$. Note also that $A^*A$ is self-adjoint, since its
adjoint is $A^*A^{**} = A^*A$. Thus, if $A$ and $A^*$ commute, we have

\[
\begin{aligned}
\|A\|^2 = \|A^*A\| = R(A^*A) &\le R(A^*)R(A) \\
&\le \|A^*\|\,R(A) = \|A\|\,R(A).
\end{aligned}
\]

Here we have used Lemmas 8.1 and 10.22 and the general inequality between
norm and spectral radius. Dividing by $\|A\|$ gives $\|A\| \le R(A)$, unless
$\|A\| = 0$, in which case the desired inequality is trivially satisfied. ■
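
A quick numerical illustration of Proposition 10.21 in dimension two: for a normal matrix the operator norm equals the spectral radius, while for a non-normal matrix the norm can be strictly larger. The sketch below uses only closed-form 2x2 formulas. The two sample matrices and all helper names are assumptions made for illustration.

```python
import cmath
import math

def matmul(X, Y):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def adjoint(X):
    """Conjugate transpose of a 2x2 matrix."""
    return [[X[j][i].conjugate() for j in range(2)] for i in range(2)]

def eigenvalues(M):
    """Eigenvalues of a 2x2 complex matrix from its trace and determinant."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return [(tr + root) / 2, (tr - root) / 2]

def spectral_radius(M):
    """R(M): the largest absolute value of an eigenvalue."""
    return max(abs(z) for z in eigenvalues(M))

def operator_norm(M):
    """||M|| is the square root of the largest eigenvalue of M* M."""
    return math.sqrt(spectral_radius(matmul(adjoint(M), M)))

normal = [[1 + 0j, -1 + 0j], [1 + 0j, 1 + 0j]]   # commutes with its adjoint
nilpotent = [[0j, 1 + 0j], [0j, 0j]]             # does not commute with its adjoint

print(operator_norm(normal), spectral_radius(normal))
print(operator_norm(nilpotent), spectral_radius(nilpotent))
```

For the nilpotent matrix the spectral radius is zero while the norm is one, so the normality hypothesis in the proposition cannot be dropped.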

Theorem 10.23 If $A \in \mathcal{B}(\mathbf{H})$ is normal, then for any polynomial $p$ in two
variables, we have

\[
\sigma(p(A, A^*)) = \{ p(\lambda, \bar\lambda) \mid \lambda \in \sigma(A) \}.
\]

If, for example, $p(\lambda, \bar\lambda) = \lambda^2 \bar\lambda^3$, then $p(A, A^*) = A^2 (A^*)^3$. Note that since
$A$ and $A^*$ are assumed to commute, the map sending the polynomial $p(\lambda, \bar\lambda)$
to $p(A, A^*)$ is an algebra homomorphism. That is to say, $(pq)(A, A^*) =
p(A, A^*)q(A, A^*)$. This would not be the case if $A$ did not commute with $A^*$.

We begin by proving Theorem 10.23 in the case that $A$ is a normal
matrix. Although the matrix case is quite simple, it provides an outline for
our assault on the general result.

Proof of Theorem 10.23 in the Matrix Case. For matrices, the spectrum
is nothing but the set of eigenvalues. If $A$ commutes with $A^*$, then
for any $\lambda \in \mathbb{C}$,

\[
\begin{aligned}
\langle (A^* - \bar\lambda I)\psi, (A^* - \bar\lambda I)\psi \rangle
&= \langle \psi, (A - \lambda I)(A^* - \bar\lambda I)\psi \rangle \\
&= \langle \psi, (A^* - \bar\lambda I)(A - \lambda I)\psi \rangle \\
&= \langle (A - \lambda I)\psi, (A - \lambda I)\psi \rangle.
\end{aligned} \tag{10.24}
\]

Thus, if $\psi$ is an eigenvector for $A$ with eigenvalue $\lambda$, $\psi$ is automatically
an eigenvector for $A^*$ with eigenvalue $\bar\lambda$. It then easily follows that $\psi$ is an
eigenvector for $p(A, A^*)$ with eigenvalue $p(\lambda, \bar\lambda)$.

In the other direction, suppose $\mu$ is an eigenvalue for $p(A, A^*)$ and let $W$
denote the $\mu$-eigenspace for $p(A, A^*)$. Since $A$ and $A^*$ commute with each
other, they also commute with $p(A, A^*)$. Thus, $A$ and $A^*$ preserve $W$, as
is easily verified, and the operator $A|_W$ will have some eigenvector $\psi$ with
eigenvalue $\lambda$. Since $A\psi = \lambda\psi$, then, as in (10.24), $A^*\psi = \bar\lambda\psi$ and so

\[
p(A, A^*)\psi = p(\lambda, \bar\lambda)\psi.
\]

Since also $p(A, A^*)\psi = \mu\psi$, by assumption, we have $\mu = p(\lambda, \bar\lambda)$, where $\lambda$
is an eigenvalue for $A$. ■
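
The first half of the matrix-case argument can be watched in action on a small example. The sketch below takes a hypothetical normal 2x2 matrix with eigenvalue $1 + i$, checks that the same eigenvector has eigenvalue $1 - i$ for the adjoint, and then checks the sample polynomial $p(\lambda, \bar\lambda) = \lambda^2 \bar\lambda^3$ mentioned after Theorem 10.23. The matrix, the eigenvector (found by hand), and the helper names are illustrative assumptions.

```python
def matmul(X, Y):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def adjoint(X):
    """Conjugate transpose of a 2x2 matrix."""
    return [[X[j][i].conjugate() for j in range(2)] for i in range(2)]

def apply(M, v):
    """Apply a 2x2 matrix to a vector of length 2."""
    return [sum(M[i][k] * v[k] for k in range(2)) for i in range(2)]

A = [[1 + 0j, -1 + 0j], [1 + 0j, 1 + 0j]]   # hypothetical normal matrix
A_star = adjoint(A)
lam = 1 + 1j                                 # eigenvalue of A
v = [1 + 0j, -1j]                            # corresponding eigenvector, found by hand

def p_operator():
    """p(A, A*) = A^2 (A*)^3 for the sample polynomial p(z, conj z) = z^2 conj(z)^3."""
    A2 = matmul(A, A)
    A_star3 = matmul(A_star, matmul(A_star, A_star))
    return matmul(A2, A_star3)

def p_scalar(z):
    """The same polynomial evaluated at a complex number z."""
    return z ** 2 * z.conjugate() ** 3

lhs = apply(p_operator(), v)                 # p(A, A*) v
rhs = [p_scalar(lam) * x for x in v]         # p(lambda, conj lambda) v
print(lhs, rhs)
```

Only the inclusion from eigenvalues of $A$ to eigenvalues of $p(A, A^*)$ is exercised here; the reverse inclusion relies on the invariant-subspace argument in the proof.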

We now attempt to run the same argument for a bounded normal operator
on $\mathbf{H}$, replacing “eigenvector” with “almost eigenvector,” where $\psi$
is an $\varepsilon$-almost eigenvector for $A$ with eigenvalue $\lambda$ if
$\|(A - \lambda I)\psi\|$ is less than $\varepsilon\|\psi\|$. The
main difficulty with this approach is that for a given eigenvalue $\lambda$, the set
of $\varepsilon$-almost eigenvectors is not a vector space. To surmount this difficulty,
we will use the spectral theorem for the self-adjoint operator $B^*B$, where
$B = p(A, A^*) - \mu I$, with $\mu \in \sigma(p(A, A^*))$. We will construct a spectral
subspace $W$ for $B^*B$ such that $W$ is invariant under $A$ and $A^*$ and such
that each nonzero element of $W$ is an $\varepsilon$-almost eigenvector for $p(A, A^*)$
with eigenvalue $\mu$. (Note, however, that we are not claiming that $W$ contains all the
$\varepsilon$-almost eigenvectors for $p(A, A^*)$.)

Definition 10.24 If $A \in \mathcal{B}(\mathbf{H})$, then an $\varepsilon$-almost eigenvector for $A$
with eigenvalue $\lambda \in \mathbb{C}$ is a nonzero vector $\psi \in \mathbf{H}$ such that

\[ \|(A - \lambda I)\psi\| < \varepsilon\|\psi\|. \]

We now establish three lemmas about almost eigenvectors, the last of 
which makes use of the spectral theorem for bounded self-adjoint operators. 
With these lemmas in hand, we will have a clear path to imitate the proof 
of the matrix case of Theorem 10.23. 
Lemma 10.25 Suppose $A \in \mathcal{B}(\mathbf{H})$ is normal.

1. If $\psi$ is an $\varepsilon$-almost eigenvector for $A$ with eigenvalue $\lambda$, then $\psi$ is an
$\varepsilon$-almost eigenvector for $A^*$ with eigenvalue $\bar\lambda$.

2. A number $\lambda \in \mathbb{C}$ belongs to $\sigma(A)$ if and only if for all $\varepsilon > 0$, there
exists an $\varepsilon$-almost eigenvector with eigenvalue $\lambda$.

Proof. Point 1 follows immediately from (10.24), which holds for bounded
normal operators, not just matrices. For Point 2, suppose that an $\varepsilon$-almost
eigenvector with eigenvalue $\lambda$ exists for all $\varepsilon > 0$. Then $A - \lambda I$ cannot have
a bounded inverse, and so $\lambda \in \sigma(A)$. In the other direction, if there is some
$\varepsilon > 0$ for which no $\varepsilon$-almost eigenvector exists, then

\[ \|(A - \lambda I)\psi\| \ge \varepsilon\|\psi\| \tag{10.25} \]

for all $\psi \in \mathbf{H}$, showing that $A - \lambda I$ is injective. By (10.24), the same
inequality holds with $A - \lambda I$ replaced by $A^* - \bar\lambda I$. Thus, $A^* - \bar\lambda I$ is injective,
so by Proposition 7.3, the range of $A - \lambda I$ is dense in $\mathbf{H}$. Using (10.25) as
in the proof of Proposition 7.7, it is easily seen that the range of $A - \lambda I$ is
also closed, hence all of $\mathbf{H}$. Thus, $A - \lambda I$ is invertible and the inverse is
bounded, by (10.25). ■

Lemma 10.26 Suppose $A \in \mathcal{B}(\mathbf{H})$ is normal. Then for each polynomial
$p$ in two variables and each number $\lambda \in \mathbb{C}$, there is a constant $C$ such
that if $\psi$ is an $\varepsilon$-almost eigenvector for $A$ with eigenvalue $\lambda$, then $\psi$ is a
$(C\varepsilon)$-almost eigenvector for $p(A, A^*)$ with eigenvalue $p(\lambda, \bar\lambda)$.

Proof. We decompose $p(A, A^*) - p(\lambda, \bar\lambda)I$ into a linear combination of
terms of the form $A^k (A^*)^l - \lambda^k \bar\lambda^l I$ and we estimate such terms by induction
on $k + l$. If $k = 1$ and $l = 0$, there is nothing to prove, and if $k = 0$ and
$l = 1$, we use (10.24). Assume now that we have established the desired
result for $k + l = N$ and consider a case with $k + l = N + 1$. If $k > 0$, we
write

\[
(A^k (A^*)^l - \lambda^k \bar\lambda^l I)\psi
  = A^{k-1}(A^*)^l (A - \lambda I)\psi
  + \lambda (A^{k-1}(A^*)^l - \lambda^{k-1}\bar\lambda^l I)\psi.
\tag{10.26}
\]

Since $\psi$ is an $\varepsilon$-almost eigenvector and $A$ and $A^*$ are bounded, the norm of
the first term on the right-hand side of (10.26) is at most $c_1\varepsilon\|\psi\|$. By induction,
the norm of the second term on the right-hand side of (10.26) is at most
$|\lambda| c_2 \varepsilon\|\psi\|$. Thus, the norm of the left-hand side of (10.26) is at most
$(c_1 + |\lambda| c_2)\varepsilon\|\psi\|$. A similar analysis holds if $k = 0$, in which case $l > 0$. ■

Lemma 10.27 Let $A \in \mathcal{B}(\mathbf{H})$ be normal, let $p$ be a polynomial in two
variables, and let $\mu$ be an element of the spectrum of $p(A, A^*)$. Then for
all $\varepsilon > 0$, there exists a nonzero closed subspace $W_\varepsilon$ of $\mathbf{H}$ such that $W_\varepsilon$ is
invariant under $A$ and $A^*$ and such that every nonzero element of $W_\varepsilon$ is
an $\varepsilon$-almost eigenvector for $p(A, A^*)$ with eigenvalue $\mu$.

Proof. Fix some $\mu$ in the spectrum of $p(A, A^*)$ and let $B = p(A, A^*) - \mu I$.
Then $B$ is normal and 0 belongs to the spectrum of $B$. Using Point 2 of
Lemma 10.25 and Lemma 10.26, we see that 0 belongs to the spectrum of
the self-adjoint operator $B^*B$. We apply the spectral theorem to $B^*B$ and
we let $W_\varepsilon$ be the spectral subspace for $B^*B$ corresponding to the interval
$(-\varepsilon^2, \varepsilon^2)$. By Proposition 7.15, $W_\varepsilon$ is nonzero and invariant under $B^*B$,
and the restriction of $B^*B$ to $W_\varepsilon$ has norm at most $\varepsilon^2$. Thus, for all $\psi \in W_\varepsilon$
we have

\[ \langle B\psi, B\psi \rangle = \langle \psi, B^*B\psi \rangle \le \|\psi\|\,\|B^*B\psi\| \le \varepsilon^2 \|\psi\|^2. \]

Since $B = p(A, A^*) - \mu I$, this shows that every nonzero element of $W_\varepsilon$
is an $\varepsilon$-almost eigenvector for $p(A, A^*)$ with eigenvalue $\mu$. Furthermore, $A$
and $A^*$ commute with $B^*B$ and thus they preserve each spectral subspace
of $B^*B$ (Proposition 7.16), including $W_\varepsilon$. ■

Proof of Theorem 10.23. Suppose first that $\lambda$ belongs to the spectrum of
$A$. By Point 2 of Lemma 10.25, $A$ has $\varepsilon$-almost eigenvectors with eigenvalue
$\lambda$ for every $\varepsilon > 0$. Lemma 10.26 then shows that $p(A, A^*)$ has $(C\varepsilon)$-almost
eigenvectors with eigenvalue $p(\lambda, \bar\lambda)$ for every $\varepsilon > 0$, which shows that
$p(\lambda, \bar\lambda)$ is in the spectrum of $p(A, A^*)$.

In the other direction, suppose that $\mu$ is in the spectrum of $p(A, A^*)$.
For any $\varepsilon > 0$, we consider the nonzero subspace $W_\varepsilon$ in Lemma 10.27,
which is invariant under $A$ and $A^*$. The restriction of $A$ to $W_\varepsilon$ is again a
normal operator (Exercise 8), and $A|_{W_\varepsilon}$ has nonempty spectrum
(Proposition 7.5). If we fix some $\lambda \in \sigma(A|_{W_\varepsilon})$, Lemma 10.25 tells us that there
exists an $\varepsilon$-almost eigenvector $\psi$ for $A$ in $W_\varepsilon$ with eigenvalue $\lambda$. By
Lemma 10.26, $\psi$ is a $(C\varepsilon)$-almost eigenvector for $p(A, A^*)$ with eigenvalue
$p(\lambda, \bar\lambda)$. Meanwhile, since $\psi \in W_\varepsilon$, the same vector $\psi$ is also an
$\varepsilon$-almost eigenvector for $p(A, A^*)$ with eigenvalue $\mu$. It then is easy to see
(Exercise 10) that

\[ |\mu - p(\lambda, \bar\lambda)| < C\varepsilon + \varepsilon. \tag{10.27} \]

Since (10.27) holds for all $\varepsilon > 0$, we can find a sequence $\lambda_n$ of points in
$\sigma(A)$ such that $p(\lambda_n, \bar\lambda_n) \to \mu$. Since $\sigma(A)$ is compact, we can pass to a
subsequence of the $\lambda_n$'s that is convergent to some $\lambda \in \sigma(A)$, and this $\lambda$
will satisfy $p(\lambda, \bar\lambda) = \mu$. ■

Combining Theorem 10.23 with the equality of the norm and spectral
radius for normal operators (Proposition 10.21), we have the following
result. If $A \in \mathcal{B}(\mathbf{H})$ is normal and $p$ is a polynomial in two variables, then

\[ \|p(A, A^*)\| = \sup_{\lambda \in \sigma(A)} |p(\lambda, \bar\lambda)|. \]

The map $p \mapsto p(A, A^*)$ has the property that $\bar p(A, A^*) = (p(A, A^*))^*$,
where the polynomial $\bar p$ is the complex-conjugate of $p$. In particular, if $p$
takes only real values on $\sigma(A)$, then $p(A, A^*)$ is self-adjoint.
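For a normal matrix, the norm identity above can be observed directly. The sketch below is a finite-dimensional illustration under stated assumptions: NumPy is available, the matrix is built as $Q\,\mathrm{diag}(\lambda)\,Q^*$ with $Q$ unitary, and the polynomial $p(z, w) = zw - 2z + i$ is a hypothetical choice made for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6

# Unitary Q from a QR factorization; A = Q diag(lam) Q* is then normal.
q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
lam = rng.standard_normal(n) + 1j * rng.standard_normal(n)
a = q @ np.diag(lam) @ q.conj().T
a_star = a.conj().T

# Sample polynomial p(z, w) = z*w - 2*z + 1j (hypothetical choice).
p_of_a = a @ a_star - 2 * a + 1j * np.eye(n)

# Operator norm (largest singular value) of p(A, A*).
operator_norm = np.linalg.norm(p_of_a, 2)

# Supremum of |p(lam, conj(lam))| over the spectrum of A.
sup_on_spectrum = np.max(np.abs(lam * np.conj(lam) - 2 * lam + 1j))
```

The two quantities `operator_norm` and `sup_on_spectrum` should agree up to rounding error.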



By the complex-valued version of the Stone-Weierstrass theorem (A.12),
polynomials in $\lambda$ and $\bar\lambda$ are dense in $C(\sigma(A); \mathbb{C})$, the space of continuous
complex-valued functions on $\sigma(A)$. Thus, the BLT theorem (Theorem A.36)
tells us that we can extend the map $p \mapsto p(A, A^*)$ to an isometric map of
$C(\sigma(A); \mathbb{C})$ into $\mathcal{B}(\mathbf{H})$. This extension, which we call the continuous
functional calculus for $A$, has all the same properties as in the self-adjoint case.

Now that the continuous functional calculus for normal operators has 
been established, the proof of the spectral theorem—in any of its various 
versions—proceeds exactly as in the self-adjoint case. There is no need, 
then, to repeat the arguments given in Chap. 8. 


10.4 Proof of the Spectral Theorem for Unbounded 
Self-Adjoint Operators 

To prove the spectral theorem for an unbounded self-adjoint operator A, 
we will construct from A a certain unitary (and thus normal) operator 
U. We then apply the spectral theorem for bounded normal operators to 
U and translate this result into the desired result for A. To motivate the 
construction of U, consider the function 

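\[ C(x) := \frac{x + i}{x - i}, \quad x \in \mathbb{R}. \tag{10.28} \]

It is a simple matter to check that $C$ maps $\mathbb{R}$ injectively onto $S^1 \setminus \{1\}$, with
inverse given by

\[ D(u) := i\,\frac{u + 1}{u - 1}, \quad u \in S^1 \setminus \{1\}. \tag{10.29} \]

Furthermore, we have $\lim_{x \to \pm\infty} C(x) = 1$. The function $C(x)$ in (10.28) is
the simplest bounded, injective function one can define on $\mathbb{R}$.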
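The stated properties of $C$ and $D$ are easy to check numerically. The following Python sketch, which assumes NumPy and uses hypothetical function names, evaluates both maps on a grid of real numbers.

```python
import numpy as np

def cayley(x):
    """C(x) = (x + i)/(x - i) for real x, as in (10.28)."""
    return (x + 1j) / (x - 1j)

def cayley_inverse(u):
    """D(u) = i(u + 1)/(u - 1) for u on the unit circle with u != 1, as in (10.29)."""
    return 1j * (u + 1) / (u - 1)

# Sample points on the real line and their images on the circle.
xs = np.linspace(-50.0, 50.0, 1001)
us = cayley(xs)
```

On this grid, `np.abs(us)` is identically 1 up to rounding, `cayley_inverse(us)` recovers `xs`, and `cayley(x)` approaches 1 as `|x|` grows.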

We wish to apply the map $C$ to a self-adjoint operator $A$. If $A$ is bounded
and self-adjoint, it is straightforward to check that the operator
$(A + iI)(A - iI)^{-1}$ is unitary (Exercise 5). Even in the unbounded case, it is possible to
make sense of the operator $U := C(A)$, and we can recover $A$ from $U$, by
(essentially) applying $D$. The operator $U$ is unitary and is known as the
Cayley transform of $A$.

Recall that if $A$ is self-adjoint, then $i$ is in the resolvent set of $A$ and the
operator $(A - iI)^{-1}$ maps $\mathbf{H}$ into $\mathrm{Dom}(A)$.

Theorem 10.28 (Cayley Transform) If $A$ is a self-adjoint operator on
$\mathbf{H}$, let $U$ be the operator defined by

\[ U\psi = (A + iI)(A - iI)^{-1}\psi. \]

Then the following results hold.

1. The operator $U$ is a unitary operator on $\mathbf{H}$.

2. The operator $U - I$ is injective.

3. The range of the operator $U - I$ is equal to $\mathrm{Dom}(A)$ and for all
$\psi \in \mathrm{Range}(U - I)$ we have

\[ A\psi = i(U + I)(U - I)^{-1}\psi. \tag{10.30} \]

According to Point 2, $U - I$ is injective, while according to Point 3, the
range of $U - I$ is $\mathrm{Dom}(A)$. Thus, in (10.30), the expression $(U - I)^{-1}$ refers
to the inverse of the one-to-one and onto map $U - I : \mathbf{H} \to \mathrm{Dom}(A)$. We
are not claiming that 1 is in the resolvent set of $U$. That is to say, $(U - I)^{-1}$
is not a bounded operator, unless $\mathrm{Dom}(A) = \mathbf{H}$, which occurs only if $A$ is
bounded.

Proof. The resolvent operator $(A - iI)^{-1}$ must be injective, because

\[ (A - iI)(A - iI)^{-1}\psi = \psi \]

for all $\psi \in \mathbf{H}$. Furthermore, $(A - iI)^{-1}$ maps $\mathbf{H}$ onto $\mathrm{Dom}(A)$, because

\[ \psi = (A - iI)^{-1}(A - iI)\psi \]

for all $\psi \in \mathrm{Dom}(A)$. Since $-i$ is also in the resolvent set of $A$, similar
reasoning shows that $A + iI$ maps $\mathrm{Dom}(A)$ injectively onto $\mathbf{H}$. Thus, $U$ is
the composition of one operator that maps $\mathbf{H}$ injectively onto $\mathrm{Dom}(A)$ and
another operator that maps $\mathrm{Dom}(A)$ injectively onto $\mathbf{H}$, so that $U$ maps
$\mathbf{H}$ injectively onto $\mathbf{H}$.

Now, for any $\phi \in \mathrm{Dom}(A)$ we have

\[
\begin{aligned}
\langle (A + iI)\phi, (A + iI)\phi \rangle
  &= \langle A\phi, A\phi \rangle + \langle \phi, \phi \rangle \\
  &= \langle (A - iI)\phi, (A - iI)\phi \rangle,
\end{aligned}
\]

because of a familiar cancellation of cross terms. Thus, applying this with
$\phi = (A - iI)^{-1}\psi$ shows that for any $\psi \in \mathbf{H}$, we have

\[
\begin{aligned}
\langle (A + iI)(A - iI)^{-1}\psi, (A + iI)(A - iI)^{-1}\psi \rangle
  &= \langle (A - iI)(A - iI)^{-1}\psi, (A - iI)(A - iI)^{-1}\psi \rangle \\
  &= \langle \psi, \psi \rangle.
\end{aligned}
\]

Thus, $U$ is one-to-one and onto and preserves norms and is therefore
unitary.

For Point 2, observe that for any $\psi \in \mathbf{H}$, we have

\[
\begin{aligned}
(A + iI)(A - iI)^{-1}\psi
  &= ((A - iI) + 2iI)(A - iI)^{-1}\psi \\
  &= \psi + 2i(A - iI)^{-1}\psi.
\end{aligned}
\tag{10.31}
\]

Thus, since $(A - iI)^{-1}$ is injective, we cannot have $U\psi = \psi$ unless $\psi = 0$.
Finally, for Point 3, (10.31) says that

\[ U - I = 2i(A - iI)^{-1}, \tag{10.32} \]

which means (by the reasoning at the start of the proof) that the range of
$U - I$ is $\mathrm{Dom}(A)$. For $\psi \in \mathrm{Dom}(A)$, we then have

\[
\begin{aligned}
(U + I)(U - I)^{-1}\psi
  &= \frac{1}{2i}(U + I)(A - iI)\psi \\
  &= \frac{1}{2i}\left[(A + iI) + (A - iI)\right]\psi \\
  &= \frac{1}{i}A\psi,
\end{aligned}
\]

which establishes Point 3. ■
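In finite dimensions every self-adjoint operator is bounded, so the theorem reduces to a statement about Hermitian matrices that can be checked directly. The sketch below is only an illustration of that bounded case (compare Exercise 5). It assumes NumPy, and the function names are hypothetical.

```python
import numpy as np

def cayley_transform(a):
    """U = (A + iI)(A - iI)^{-1} for a Hermitian matrix A."""
    eye = np.eye(a.shape[0])
    return (a + 1j * eye) @ np.linalg.inv(a - 1j * eye)

def inverse_cayley(u):
    """A = i(U + I)(U - I)^{-1}, valid when 1 is not an eigenvalue of U."""
    eye = np.eye(u.shape[0])
    return 1j * (u + eye) @ np.linalg.inv(u - eye)

rng = np.random.default_rng(2)
b = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
a = (b + b.conj().T) / 2      # Hermitian, hence self-adjoint
u = cayley_transform(a)       # should be unitary
```

Here `u.conj().T @ u` is the identity up to rounding, `inverse_cayley(u)` reproduces `a` as in (10.30), and `u - I` equals `2i (a - iI)^{-1}` as in (10.32).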

We may apply the spectral theorem for bounded normal operators to
associate a projection-valued measure $\mu^U$ to $U$. We will then transfer this
measure from $S^1 \setminus \{1\}$ to $\mathbb{R}$ by means of the map $D$ in (10.29) to obtain the
desired projection-valued measure $\mu^A$ for $A$.

Proposition 10.29 Let $A$ be a self-adjoint operator on $\mathbf{H}$, let $U$ be the
unitary operator in Theorem 10.28, and let $D : S^1 \setminus \{1\} \to \mathbb{R}$ be as in (10.29).
Then

\[ A = D(U), \tag{10.33} \]

where $D(U)$ is defined by the functional calculus for $U$.

More precisely, $D(U) = \int D(\lambda)\, d\mu^U(\lambda)$, where $\mu^U$ is the projection-valued
measure associated to $U$ by the spectral theorem for bounded normal
operators. Note that by Point 2 of Theorem 10.28, 1 is not an eigenvalue for
$U$ and thus $\mu^U(\{1\}) = 0$. Thus, $D$ is an almost-everywhere-defined function
on $\sigma(U)$, even if $1 \in \sigma(U)$. As always, the equality in (10.33) includes
equality of domains, where the domain of $\int D\, d\mu^U$ is the space $W_D$ in
Proposition 10.1.

Proposition 10.29 should certainly be plausible in light of the previously 
established formula (10.30) for A in terms of U. 

Proof. Suppose $E$ is a Borel subset of $S^1 \setminus \{1\}$ such that the closure $\bar E$ of $E$
does not contain 1, and let $V_E = \mathrm{Range}(\mu^U(E))$ be the associated spectral
subspace. Then the spectrum of $U|_{V_E}$ is contained in $\bar E$, which means that
the functions $u \mapsto D(u)$ and $u \mapsto 1/(u - 1)$ are bounded on $\sigma(U|_{V_E})$. Now,
by comparing the quadratic forms, we can see that $D(U)|_{V_E} = D(U|_{V_E})$.
Then by the multiplicativity of the functional calculus for $U$ on bounded
functions, we have

\[ D(U)\psi = i(U + I)(U - I)^{-1}\psi \]

for all $\psi \in V_E$. Thus, by Point 3 of Theorem 10.28, $D(U)$ agrees with $A$
on $V_E$.

Meanwhile, if we decompose $S^1 \setminus \{1\}$ as the disjoint union of sets $E_n$
for which $\bar E_n$ does not contain 1, then $\mathbf{H}$ is the Hilbert space direct sum
of the subspaces $V_{E_n}$. Now, $A$ and (by Proposition 10.3) $D(U)$ are both
self-adjoint. Furthermore, these operators agree on the finite direct sum
of the $V_{E_n}$'s and they are essentially self-adjoint on this finite sum, by
Example 9.26. Thus, $A$ and $D(U)$ must be equal (with equality of domain). ■


Theorem 10.30 Define a projection-valued measure $\mu^A$ on $\mathbb{R}$ by

\[ \mu^A(E) = \mu^U(C(E)). \tag{10.34} \]

Then

\[ A = \int_{\mathbb{R}} \lambda\, d\mu^A(\lambda), \tag{10.35} \]

where $\mu^U$ is the projection-valued measure coming from the spectral theorem
for the bounded normal operator $U$ and $C$ is the map defined in (10.28).

Proof. If for any $\psi \in \mathbf{H}$, we define $\mu^U_\psi(E) = \langle \psi, \mu^U(E)\psi \rangle$ and similarly define
$\mu^A_\psi$, then we have

\[ \mu^A_\psi(E) = \mu^U_\psi(C(E)). \]

By the abstract change of variables theorem from measure theory, we have

\[ \int_{\mathbb{R}} \lambda^2\, d\mu^A_\psi(\lambda) = \int_{S^1 \setminus \{1\}} D(u)^2\, d\mu^U_\psi(u), \tag{10.36} \]

since $D$ is the inverse map to $C$. Thus, the two operators in (10.35) have
the same domain. Furthermore, if we replace $\lambda^2$ by $\lambda$ and $D(u)^2$ by $D(u)$
in (10.36), we see that the operators in (10.35) are also equal. ■

Proof of Theorem 10.4. The existence of the desired projection-valued
measure $\mu^A$ is the content of Theorem 10.30. To establish uniqueness,
suppose $\nu^A$ is a projection-valued measure on $\sigma(A)$ such that $\int \lambda\, d\nu^A(\lambda) = A$.
Consider then the operator $C(A)$ as defined by integration of the function
$C(\lambda)$ against $\nu^A$. Arguing as in the proof of Proposition 10.29, we can see
that $C(A)$, computed in this fashion, coincides with the operator $U = C(A)$
defined as the product of $(A + iI)$ and $(A - iI)^{-1}$.

Now define a projection-valued measure $\nu^U$ on $S^1$ by setting
$\nu^U(E) = \nu^A(C^{-1}(E))$. Then as in the proof of Theorem 10.30, we have
$\int_{S^1} u\, d\nu^U(u) = U$. The uniqueness part of the spectral theorem for $U$
(Theorem 10.20) then tells us that $\nu^U = \mu^U$, from which it follows that
$\nu^A = \mu^A$. ■

Proof of Theorem 10.9. By the direct-integral form of the spectral
theorem for $U = C(A)$, there is a family of Hilbert spaces $\mathbf{H}_\lambda$, $\lambda \in \sigma(U) \subset S^1$,
and a positive, real-valued measure $\mu$ on $\sigma(U)$ such that $\mathbf{H}$ is unitarily
equivalent to $\int^{\oplus}_{\sigma(U)} \mathbf{H}_\lambda\, d\mu(\lambda)$, in such a way that the operator $U$ corresponds to
the map $s(\lambda) \mapsto \lambda s(\lambda)$. Since 1 is not an eigenvalue for $U$, either $\mathbf{H}_1 = \{0\}$
or $\mu(\{1\}) = 0$. Either way, $\mathbf{H}_1$ is “negligible” in the direct integral. We can
then define a family of Hilbert spaces $\mathbf{K}_\lambda := \mathbf{H}_{C(\lambda)}$, for $\lambda \in \sigma(A) \subset \mathbb{R}$, and
a measure $\nu$ on $\sigma(A)$ given by $\nu(E) = \mu(C(E))$. We may then form the
direct integral $\int^{\oplus}_{\sigma(A)} \mathbf{K}_\lambda\, d\nu(\lambda)$. This direct integral is unitarily equivalent in
an obvious way to $\int^{\oplus}_{\sigma(U)} \mathbf{H}_\lambda\, d\mu(\lambda)$. We wish to show, then, that
$\int^{\oplus}_{\sigma(A)} \mathbf{K}_\lambda\, d\nu(\lambda)$
is unitarily equivalent to $\mathbf{H}$ in such a way that the operator $A$ corresponds
to the (unbounded) operator mapping $s(\lambda)$ to $\lambda s(\lambda)$. Since the argument
is similar to that in the proof of Theorem 10.4, we omit the details.

As in the proof of Theorem 10.4, the uniqueness in Theorem 10.9 can
be reduced to the uniqueness for the direct-integral form of the spectral
theorem for $U$. ■

The proof of the multiplication operator form of the spectral theorem 
for unbounded operators is similar to the preceding proofs and is omitted. 


10.5 Exercises 

1. (a) If $A$ is a bounded self-adjoint operator, show that $U(t) := e^{iAt}$
is continuous in the operator norm topology.

(b) Using the spectral theorem, show that if $A$ is a self-adjoint
operator and $\sigma(A)$ is a bounded subset of $\mathbb{R}$, then $A$ is bounded.

(c) Suppose $A$ is a self-adjoint operator that is not bounded. Show
that $U(t) := e^{iAt}$ is not continuous in the operator norm
topology.

Hint: Consider $\psi$ in a spectral subspace of the form $V_{(\lambda_0 - \varepsilon, \lambda_0 + \varepsilon)}$,
where $\lambda_0$ is a point in $\sigma(A)$ with $|\lambda_0|$ large.

2. Let $P_j$ be the unbounded self-adjoint operator defined in Sect. 9.8.
Show that the one-parameter unitary group $e^{itP_j}$ generated by $P_j$ is
given by

\[ (e^{itP_j}\psi)(\mathbf{x}) = \psi(\mathbf{x} + t\hbar e_j) \]

for all $\psi \in L^2(\mathbb{R}^n)$, where $e_j$ is the $j$th element of the standard basis
for $\mathbb{R}^n$.

Hint: First determine the Fourier transform of $e^{itP_j}\psi$, using
Proposition 9.32.

3. If $A$ is an unbounded self-adjoint operator on $\mathbf{H}$, let us say that a
family $\psi(t)$ of elements of $\mathbf{H}$ satisfies the equation

\[ \frac{d\psi}{dt} = iA\psi(t) \tag{10.37} \]

in the strong sense if each $\psi(t)$ belongs to $\mathrm{Dom}(A)$ and

\[ \lim_{h \to 0} \left\| \frac{\psi(t + h) - \psi(t)}{h} - iA\psi(t) \right\| = 0 \]

for every $t \in \mathbb{R}$. If we define $\psi(t)$ by $\psi(t) = e^{itA}\psi_0$, for some $\psi_0 \in \mathbf{H}$,
show that $\psi(t)$ satisfies (10.37) in the strong sense if and only if $\psi_0$
belongs to $\mathrm{Dom}(A)$.

4. Suppose $A$ is an unbounded self-adjoint operator and suppose that
there exists a number $\gamma \in \mathbb{R}$ and a nonzero vector $\psi \in \mathrm{Dom}(A)$ such
that

\[ \|A\psi - \gamma\psi\| < \varepsilon\|\psi\| \]

for some $\varepsilon > 0$. Show that there exists a number $\tilde\gamma$ in the spectrum
of $A$ such that $|\tilde\gamma - \gamma| < \varepsilon$.

Hint: If no such $\tilde\gamma$ existed, the function $f(\lambda) := 1/(\lambda - \gamma)$ would
satisfy $|f(\lambda)| \le 1/\varepsilon$ for all $\lambda \in \sigma(A)$. Consider, then, the operator
$f(A)$, which is nothing but $(A - \gamma I)^{-1}$.

5. If $A$ is a bounded self-adjoint operator, show that the operator $C(A)$
given by

\[ C(A) = (A + iI)(A - iI)^{-1} \]

is unitary and that 1 is in the resolvent set of $C(A)$. Show also that
$A$ can be recovered from $C(A)$ by the formula

\[ A = i(C(A) + I)(C(A) - I)^{-1}. \]


6. Show that Lemma 10.22 is false if we do not assume that $A$ and $B$
commute.

7. Let $A$ be a normal matrix and $p$ a polynomial in two variables. Show
by example that an eigenvector for $p(A, A^*)$ is not necessarily an
eigenvector for $A$.

Note: Nevertheless, the proof of the matrix case of Theorem 10.23
shows that if $\mu$ is an eigenvalue for $p(A, A^*)$, then there exists some
eigenvector for $p(A, A^*)$ with eigenvalue $\mu$ that is also an eigenvector
for $A$.

8. Suppose $A \in \mathcal{B}(\mathbf{H})$ and $W$ is a closed subspace of $\mathbf{H}$ that is invariant
under $A$ and $A^*$.

(a) Show that $(A|_W)^* = A^*|_W$.

(b) Show that if $A$ is normal, the restriction of $A$ to $W$ is normal.

9. (a) Suppose that $\mathbf{H}$ is finite dimensional, $A$ is a normal operator on
$\mathbf{H}$, and $W$ is a subspace of $\mathbf{H}$ that is invariant under $A$. Show
that $W$ is invariant under $A^*$.

(b) Show by example that the result of Part (a) is false if $\mathbf{H}$ is infinite
dimensional.

10. Given $A \in \mathcal{B}(\mathbf{H})$, suppose that the same vector $\psi$ is an $\varepsilon$-almost
eigenvector for $A$ with eigenvalue $\lambda$ and a $\delta$-almost eigenvector for $A$
with eigenvalue $\mu$. Show that $|\lambda - \mu| < \varepsilon + \delta$.


11 

The Harmonic Oscillator 


11.1 The Role of the Harmonic Oscillator 

The harmonic oscillator is an important model for various reasons. In 
solid-state physics, for example, a crystal is modeled as a large number 
of coupled harmonic oscillators. Using the notion of “normal modes,” this 
model is then transformed into independent one-dimensional harmonic 
oscillators with different frequencies. In the quantum mechanical setting, 
the excitations of the different normal modes are called phonons. 

A free quantum field theory is similarly modeled as a family of
coupled harmonic oscillators, except that in the field theory setting we have
infinitely many of the oscillators. Even interacting quantum field
theories are often described using the harmonic oscillator raising and lowering
operators, which are referred to as creation and annihilation operators in
the context of field theory.

Our approach to analyzing the harmonic oscillator also introduces the 
algebraic approach to quantum mechanics, in which algebra (commuta¬ 
tion relations between various operators) substantially replaces analysis 
(differential equations) as the way to solve quantum systems. Most of the 
effort in analyzing the harmonic oscillator occurs in the algebraic sec¬ 
tion (Sect. 11.2), with the remaining analytic issues being taken care of 
in Sects. 11.3 and 11.4. 
11.2 The Algebraic Approach

In this section we will derive as much information as possible about the 
Hamiltonian operator for a quantum harmonic oscillator using only the 
commutation relation between the position and momentum operators, 


\[ [X, P] = i\hbar I. \tag{11.1} \]

Here, as usual, $[\cdot, \cdot]$ denotes the commutator, given by $[A, B] = AB - BA$.
We consider, then, a harmonic oscillator with Hamiltonian given by

\[ H = \frac{1}{2m}P^2 + \frac{k}{2}X^2, \tag{11.2} \]


where k is a positive constant. Our goal is to see what we can say about 
the eigenvectors and eigenvalues of H using only the fact that X and P are 
self-adjoint operators satisfying (11.1), without making use of the actual 
formulas for these operators. 

To be honest, we are actually assuming certain domain conditions
regarding the operators $X$ and $P$, in addition to the commutation relation (11.1),
namely that the vectors $\psi_n$ in Theorem 11.2 are actually in the domain of
$X$ and $P$ (and thus, also, in the domain of the raising and lowering
operators). In this section, we follow the usual physics practice of assuming that
all the vectors we work with are in the domain of all the relevant
operators. This assumption will turn out to be correct in the case we are actually
considering, in which $X$ and $P$ are the usual position and momentum
operators on $L^2(\mathbb{R})$. (See Sect. 11.4.) It is a more complicated matter to work
out the domain conditions that must be imposed on two self-adjoint
operators satisfying (11.1) in order for the argument of the present section to
be valid. We will come back to this issue in Chap. 14.

Following, again, the convention in the physics literature, we now
eliminate the spring constant $k$ in favor of the frequency $\omega = \sqrt{k/m}$ of the
corresponding classical harmonic oscillator. [Solutions to Hamilton’s equations
with classical Hamiltonian $H(x, p)$ equal to $p^2/(2m) + kx^2/2$ are sinusoidal
with frequency $\sqrt{k/m}$.] Replacing $k$ by $m\omega^2$, we may rewrite (11.2) as

\[ H = \frac{1}{2m}\left(P^2 + (m\omega X)^2\right). \tag{11.3} \]

We now introduce the “lowering operator” $a$, given by

\[ a = \frac{m\omega X + iP}{\sqrt{2\hbar m\omega}}, \tag{11.4} \]

and its adjoint $a^*$, the “raising operator,” given by

\[ a^* = \frac{m\omega X - iP}{\sqrt{2\hbar m\omega}}. \tag{11.5} \]
The reason for the terminology “raising” and “lowering” is that these
operators raise and lower the eigenvalue for the Hamiltonian, as we will
see shortly. In the context of quantum field theory, operators very much
like $a^*$ and $a$ are called creation operators and annihilation operators,
respectively, because they map from the $n$-particle space to either the
$(n+1)$-particle space or the $(n-1)$-particle space, thus “creating” or “annihilating”
a particle.

In the world of noncommuting operators, $(A - B)(A + B)$ does not equal
$A^2 - B^2$; rather,

\[ (A - B)(A + B) = A^2 - B^2 + [A, B]. \]

Thus, if we compute $a^*a$ using (11.1) we get

\[
\begin{aligned}
a^*a &= \frac{1}{2\hbar m\omega}\left((m\omega X)^2 + P^2 + im\omega[X, P]\right) \\
     &= \frac{1}{\hbar\omega}\,\frac{1}{2m}\left(P^2 + (m\omega X)^2\right) - \frac{1}{2}I.
\end{aligned}
\]

From this we obtain

\[ H = \hbar\omega\left(a^*a + \frac{1}{2}I\right). \]

The $\frac{1}{2}I$ on the right-hand side of this expression should be thought of as a
“quantum correction,” in that there would be no such term in the analogous
formula for the classical Hamiltonian.

It suffices to work out the spectral properties (eigenvectors and
eigenvalues) of $a^*a$. To get back to $H$, we keep the same eigenvectors and
simply add 1/2 to the eigenvalues and then multiply by $\hbar\omega$. We compute
that

\[
\begin{aligned}
[a, a^*] &= \frac{1}{2\hbar m\omega}\left([m\omega X, -iP] + [iP, m\omega X]\right) \\
         &= \frac{1}{2\hbar m\omega}\left(\hbar m\omega I + \hbar m\omega I\right) \\
         &= I.
\end{aligned}
\tag{11.6}
\]

From this, it is easy to compute that

\[ [a, a^*a] = a \tag{11.7} \]

\[ [a^*, a^*a] = -a^*. \tag{11.8} \]

Now, $a^*a$ is self-adjoint (or, at the least, symmetric) because
$(a^*a)^* = a^*a^{**} = a^*a$. This operator is also non-negative, because

\[ \langle \psi, a^*a\psi \rangle = \langle a\psi, a\psi \rangle \ge 0 \]

for all $\psi$. We now come to a key computation, which demonstrates the
utility of the operators $a$ and $a^*$.
Proposition 11.1 Suppose that $\psi$ is an eigenvector for $a^*a$ with
eigenvalue $\lambda$. Then

\[
\begin{aligned}
a^*a(a\psi) &= (\lambda - 1)a\psi \\
a^*a(a^*\psi) &= (\lambda + 1)a^*\psi.
\end{aligned}
\]

Thus, either $a\psi$ is zero or $a\psi$ is an eigenvector for $a^*a$ with eigenvalue
$\lambda - 1$. Similarly, either $a^*\psi$ is zero or $a^*\psi$ is an eigenvector for $a^*a$ with
eigenvalue $\lambda + 1$. That is to say, the operators $a^*$ and $a$ raise and lower the
eigenvalues of $a^*a$, respectively.

Proof. Using the commutation relation (11.7), we find that

\[ a^*a(a\psi) = (a(a^*a) - a)\psi = (\lambda - 1)a\psi. \]

A similar calculation applies to $a^*\psi$, using (11.8). ■

If $\psi$ is an eigenvector for $a^*a$ with eigenvalue $\lambda$, then

\[ \lambda\langle \psi, \psi \rangle = \langle \psi, a^*a\psi \rangle = \langle a\psi, a\psi \rangle \ge 0, \]

which means that $\lambda \ge 0$. Let us assume that $a^*a$ has at least one
eigenvector $\psi$, with eigenvalue $\lambda$, which we expect since $a^*a$ is self-adjoint. Since
$a$ lowers the eigenvalue of $a^*a$, if we apply $a$ repeatedly to $\psi$, we must
eventually get zero. After all, if $a^n\psi$ were always nonzero, these vectors
would be, for large $n$, eigenvectors for $a^*a$ with negative eigenvalue, which
we have seen is impossible.

It follows that there exists some $N \ge 0$ such that $a^N\psi \ne 0$ but $a^{N+1}\psi = 0$.
If we define $\psi_0$ by

\[ \psi_0 := a^N\psi, \]

then $a\psi_0 = 0$, which means that $a^*a\psi_0 = 0$. Thus, $\psi_0$ is an eigenvector for
$a^*a$ with eigenvalue 0. (It follows that the original eigenvalue $\lambda$ must have
been equal to the non-negative integer $N$.)

The conclusion is this: Provided that $a^*a$ has at least one eigenvector $\psi$,
we can find a nonzero vector $\psi_0$ such that

\[ a\psi_0 = a^*a\psi_0 = 0. \]

Since $a^*a$ cannot have negative eigenvalues, we may call $\psi_0$ a “ground state”
for $a^*a$, that is, an eigenvector with lowest possible eigenvalue. We may then
apply the raising operator $a^*$ repeatedly to $\psi_0$ to obtain eigenvectors for
$a^*a$ with positive eigenvalues.

Theorem 11.2 If $\psi_0$ is a unit vector with the property that $a\psi_0 = 0$, then
the vectors

\[ \psi_n := (a^*)^n\psi_0, \quad n \ge 0, \]

satisfy the following relations for all $n, m \ge 0$:

\[
\begin{aligned}
a^*\psi_n &= \psi_{n+1} \\
a^*a\psi_n &= n\psi_n \\
\langle \psi_n, \psi_m \rangle &= n!\,\delta_{n,m} \\
a\psi_{n+1} &= (n + 1)\psi_n.
\end{aligned}
\]

Let us think for a moment about what this is saying. We have an
orthogonal “chain” of eigenvectors for $a^*a$ with eigenvalues $0, 1, 2, \ldots$, with the
norm of $\psi_n$ equal to $\sqrt{n!}$. The raising operator $a^*$ shifts us up the chain,
while the lowering operator $a$ shifts us down the chain (up to a constant).
In particular, the “ground state” $\psi_0$ is annihilated by $a$. Thus, we have a
complete understanding of how $a$ and $a^*$ act on this chain of eigenvectors
for $a^*a$.

Proof. The first result is the definition of $\psi_{n+1}$ and the second follows
from Proposition 11.1 and the fact that $a^*a\psi_0 = 0$. For the third result,
if $n \ne m$, we use the general result that eigenvectors for a self-adjoint
operator (in our case, $a^*a$) with distinct eigenvalues are orthogonal. (This
result actually applies to operators that are only symmetric.)

If $n = m$, we work by induction. For $n = 0$, $\langle \psi_0, \psi_0 \rangle = 1$ is assumed. If
we assume $\langle \psi_n, \psi_n \rangle = n!$, we compute that

\[
\begin{aligned}
\langle \psi_{n+1}, \psi_{n+1} \rangle
  &= \langle a^*\psi_n, a^*\psi_n \rangle \\
  &= \langle \psi_n, aa^*\psi_n \rangle \\
  &= \langle \psi_n, (a^*a + I)\psi_n \rangle \\
  &= (n + 1)\langle \psi_n, \psi_n \rangle \\
  &= (n + 1)!.
\end{aligned}
\]

Finally, we compute that

\[ a\psi_{n+1} = aa^*\psi_n = (a^*a + I)\psi_n = (n + 1)\psi_n, \]

which establishes the last claimed result. ■
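The relations in Theorem 11.2 can be illustrated with finite matrices. The sketch below works in the orthonormal basis $e_n = \psi_n/\sqrt{n!}$, in which $a e_n = \sqrt{n}\, e_{n-1}$, and truncates to the first $N$ basis vectors. The truncation is an assumption of this illustration (the actual operators act on an infinite-dimensional space), and it is why the relation $[a, a^*] = I$ fails in the last diagonal entry. NumPy is assumed.

```python
import math
import numpy as np

N = 12  # truncation size (assumption: the true operators are infinite-dimensional)

# Lowering operator in the orthonormal basis e_n = psi_n / sqrt(n!):
# a e_n = sqrt(n) e_{n-1}, so sqrt(1), ..., sqrt(N-1) sit on the superdiagonal.
a = np.diag(np.sqrt(np.arange(1, N)), k=1)
a_star = a.T  # the matrix is real, so the adjoint is the transpose

number_op = a_star @ a                 # should be diag(0, 1, ..., N-1)
commutator = a @ a_star - a_star @ a   # identity except in the last entry

# psi_n = (a*)^n psi_0, starting from the unit vector psi_0 = e_0.
psi = [np.eye(N)[:, 0]]
for n in range(1, N):
    psi.append(a_star @ psi[-1])

# Gram matrix of inner products <psi_n, psi_m>; should be diag(n!).
gram = np.array([[u @ v for v in psi] for u in psi])
```

In this model `number_op` is diagonal with entries 0, 1, 2, and so on, `gram` is diagonal with entries `n!`, and `a @ psi[n + 1]` equals `(n + 1) * psi[n]`.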

It is now reasonable to ask whether the vectors $\{\psi_n\}_{n=0}^{\infty}$ form an
orthogonal basis for the quantum Hilbert space. Suppose this is not the
case. If we then let $V$ denote the closed span of the $\psi_n$'s, $V$ will be invariant
under both $a$ and $a^*$. Thus, by elementary linear algebra, the orthogonal
complement $V^\perp$ of $V$ will also be invariant under the adjoint operators $a^*$
and $a$, and therefore also under $a^*a$. Therefore, we can begin our analysis
anew in $V^\perp$, with the result that we will obtain a new ground state $\phi_0 \in V^\perp$
(satisfying $a\phi_0 = 0$) that is orthogonal to the original ground state $\psi_0$. If,
then, the closed span of the $\psi_n$'s is not the whole Hilbert space, there will
exist at least two independent solutions of the equation $a\psi = 0$. To put this
claim the other way around, if it turns out that there is only one solution
(up to a constant) of $a\psi = 0$, then we expect that the vectors obtained by
applying $a^*$ repeatedly to the solution will form an orthogonal basis for our
Hilbert space. (Because we are glossing over various technical issues having
to do with the domains of various operators, this conclusion should not be
regarded as completely rigorous.)
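The algebraic analysis predicts that $a^*a$ has eigenvalues $0, 1, 2, \ldots$, so that $H = \hbar\omega(a^*a + \frac{1}{2}I)$ has eigenvalues $\hbar\omega(n + \frac{1}{2})$. As an independent numerical check, the following sketch discretizes $H$ on a grid and computes its lowest eigenvalues. The units $\hbar = m = \omega = 1$, the three-point finite-difference approximation of the second derivative, and the grid parameters are all assumptions of this illustration. NumPy is assumed.

```python
import numpy as np

def oscillator_levels(num_levels=5, half_width=8.0, num_points=801):
    """Lowest eigenvalues of H = -(1/2) d^2/dx^2 + (1/2) x^2 on a grid.

    Units with hbar = m = omega = 1 are assumed. The second derivative is
    replaced by the standard three-point finite difference, with the wave
    function taken to vanish outside [-half_width, half_width].
    """
    x = np.linspace(-half_width, half_width, num_points)
    h = x[1] - x[0]
    # Diagonal: kinetic part 1/h^2 plus potential x^2/2.
    main = 1.0 / h**2 + 0.5 * x**2
    # Off-diagonals: kinetic coupling -1/(2 h^2) between neighboring points.
    off = -0.5 / h**2 * np.ones(num_points - 1)
    hamiltonian = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigvalsh(hamiltonian)[:num_levels]

levels = oscillator_levels()
```

The computed `levels` should be close to 0.5, 1.5, 2.5, 3.5, 4.5, with a small discretization error that shrinks as the grid is refined.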

11.3 The Analytic Approach 

In the preceding section, we analyzed the eigenvectors of the operator $a^*a$
as much as possible using only the commutation relation $[a, a^*] = I$, which
follows from the underlying commutation relation $[X, P] = i\hbar I$. To progress
further, we must recall the actual formula for the operators $a$ and $a^*$.

To simplify our analysis, let us introduce the following natural scale of
distance for our problem:

\[ D := \sqrt{\frac{\hbar}{m\omega}}. \]

We then introduce a normalized position variable, measured in units of $D$,

\[ \tilde x := \frac{x}{D}, \tag{11.9} \]

so that

\[ \frac{d}{d\tilde x} = D\,\frac{d}{dx}. \]

A calculation gives the following simple expressions for the raising and
lowering operators:

\[
a = \frac{1}{\sqrt{2}}\left(\tilde x + \frac{d}{d\tilde x}\right), \qquad
a^* = \frac{1}{\sqrt{2}}\left(\tilde x - \frac{d}{d\tilde x}\right).
\tag{11.10}
\]

Note that the constants $m$, $\omega$, and $\hbar$ have conveniently disappeared from
the formulas.

Given the expression in (11.10), we can easily solve the (first-order,
linear) equation $a\psi_0 = 0$ as

\[ \psi_0(x) = Ce^{-\tilde x^2/2}. \tag{11.11} \]

If we take $C$ to be positive, then our normalization condition determines
its value to be $(\pi D^2)^{-1/4}$, by Proposition A.22. (The normalization condition
is that the integral of $|\psi_0|^2$ with respect to $dx$, not $d\tilde x$, should be 1.) We
obtain, then,

\[ \psi_0(x) = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} \exp\left\{-\frac{m\omega}{2\hbar}x^2\right\}. \tag{11.12} \]



It remains only to apply $a^*$ repeatedly to $\psi_0$ to get the “excited states”
$\psi_n$.

Theorem 11.3 The ground state $\psi_0$ of the harmonic oscillator is given
by (11.12). The excited states $\psi_n$ are given by

\[ \psi_n = H_n\psi_0, \tag{11.13} \]

where $H_n$ is a polynomial of degree $n$ given inductively by the formulas

\[
\begin{aligned}
H_0(\tilde x) &= 1 \\
H_{n+1}(\tilde x) &= \frac{1}{\sqrt{2}}\left(2\tilde x H_n(\tilde x) - \frac{dH_n(\tilde x)}{d\tilde x}\right).
\end{aligned}
\]

Here, $\tilde x$ is the normalized position variable given by (11.9).


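The polynomials $H_n$ are essentially (modulo various normalization
conventions) the Hermite polynomials.

Proof. When $n = 0$, (11.13) reduces to $\psi_0 = \psi_0$. Assuming that (11.13)
holds for some $n$, we compute $\psi_{n+1}$ as

\[
\begin{aligned}
\psi_{n+1} = a^*\psi_n
  &= \frac{1}{\sqrt{2}}\left(\tilde x H_n(\tilde x)\,Ce^{-\tilde x^2/2}
     - \frac{d}{d\tilde x}\left[H_n(\tilde x)\,Ce^{-\tilde x^2/2}\right]\right) \\
  &= \frac{1}{\sqrt{2}}\left(2\tilde x H_n(\tilde x) - \frac{dH_n(\tilde x)}{d\tilde x}\right)Ce^{-\tilde x^2/2}
   = H_{n+1}(\tilde x)\psi_0(x),
\end{aligned}
\]

as claimed. ■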
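The recursion in Theorem 11.3 is straightforward to run by machine. The following sketch, which assumes NumPy's `numpy.polynomial` module and uses hypothetical function names, generates the first few $H_n$ and compares them with the physicists' Hermite polynomials. Since those satisfy the same recursion without the factor $1/\sqrt{2}$, the polynomials here are $2^{-n/2}$ times the physicists' ones.

```python
import math
import numpy as np
from numpy.polynomial import Polynomial
from numpy.polynomial.hermite import herm2poly

def oscillator_polynomials(n_max):
    """Return [H_0, ..., H_n_max] from H_{n+1} = (2x H_n - H_n') / sqrt(2)."""
    x = Polynomial([0.0, 1.0])
    polys = [Polynomial([1.0])]
    for _ in range(n_max):
        h = polys[-1]
        polys.append((2 * x * h - h.deriv()) / math.sqrt(2.0))
    return polys

def rescaled_physicists_hermite(n):
    """Power-basis coefficients of 2^(-n/2) times the physicists' Hermite polynomial."""
    return herm2poly([0.0] * n + [1.0]) / 2 ** (n / 2)

polys = oscillator_polynomials(6)
```

For instance, `polys[2]` has coefficients corresponding to $2\tilde x^2 - 1$, and each `polys[n].coef` matches `rescaled_physicists_hermite(n)`.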

Figure 11.1 shows the ground state of the harmonic oscillator, along with
the excited states with $n = 5$ and $n = 30$. Each eigenfunction is plotted as
a function of the normalized position variable $\tilde x$. In each case, the shaded
region indicates the extent of the classically allowed region, that is, the
range in which a classical particle with energy $E_n$ can move. Note that
each wave function decays rapidly outside the classically allowed region.
In the last image, we can see that the frequency of oscillation of the wave
function is greatest in the middle of the classically allowed region, while the
amplitude of the wave function is greatest near the ends of the classically
allowed region. Intuitively, these properties of the wave function reflect that
a classical particle with energy $E_n$ has largest momentum in the middle of
the classically allowed region (where the potential is smallest) and that the
classical particle spends more time at the ends of the classically allowed
region, since it is moving slowest there. Further development of this sort of
reasoning may be found in Chap. 15.


11.4 Domain Conditions and Completeness 

Although the analysis in Sect. 11.2 is typical of what is found in physics 
texts, it is not completely rigorous from a mathematician’s point of view. 
FIGURE 11.1. Harmonic oscillator eigenvectors with n = 0, n = 5, and n = 30.
In each case, the classically allowed region is shaded. 


The main problem is that the lowering operator $a$, the raising operator $a^*$,
and the product operator $a^*a$ are all unbounded operators. The difficulty
in working with unbounded operators is that one constantly has to check
that a vector is in the domain of the relevant operator before applying that
operator. For example, suppose we have a vector $\psi_0$ in the domain of $a$ and
satisfying $a\psi_0 = 0$. We wish to apply the raising operator $a^*$ to $\psi_0$ and we
then want to argue that

\[ a^*a(a^*\psi_0) = a^*\psi_0. \]

This is easy enough to verify (as we did in the previous section) provided
that all vectors are in the domain of the relevant operators. But how do
we know that $\psi_0$ is in the domain of $a^*$? And even if it is, how do we know
that $a^*\psi_0$ is in the domain of $a^*a$?
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These concerns are not just theoretical. Consider a general pair of 
operators A and B satisfying [ A, B\ = ihl. If we try to analyze an op¬ 
erator of the form aA 2 + (3B 2 , for a,/3 > 0, by the methods of Sect. 11.2, 
things can easily go awry, as the counterexample in Sect. 12.2 demonstrates. 
Fortunately, in the case of the ordinary position and momentum operators, 
the putative eigenfunctions %l> n for a*a in Theorem 11.3 are very nice func¬ 
tions, in the form of a polynomial times a Gaussian. Thus, there is no 
difficulty in verifying that these functions are in the domain of any finite 
product of creation and annihilation operators. It follows that if a and a* 
are given in terms of the usual position and momentum operators and ipo 
given by (11.12), the relations in Theorem 11.2 indeed hold. 

In particular, we can see that the ^> n ’s form an orthogonal set of functions 
in L 2 (R). Showing that they form an orthogonal basis is also not terribly 
difficult. 


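Theorem 11.4 The functions

$$\psi_n(x) = H_n(\tilde{x})\psi_0(\tilde{x}) = H_n\!\left(\sqrt{\frac{m\omega}{\hbar}}\,x\right)\sqrt{\frac{\pi m\omega}{\hbar}}\,\exp\left\{-\frac{m\omega}{2\hbar}x^2\right\}$$

form an orthogonal basis for the Hilbert space $L^2(\mathbb{R})$.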

The following result is the key to the proof. 

Lemma 11.5 For all $\alpha \in \mathbb{C}$, the partial sums of the series

$$\sum_{n=0}^{\infty} \frac{\alpha^n x^n}{n!}\,e^{-x^2/2}$$

converge in $L^2(\mathbb{R})$ to the function $e^{\alpha x}e^{-x^2/2}$.

Proof. We need to show that

$$\int_{-\infty}^{\infty}\left|e^{\alpha x}e^{-x^2/2} - \sum_{n=0}^{N}\frac{\alpha^n x^n}{n!}e^{-x^2/2}\right|^2 dx = \int_{-\infty}^{\infty}\left|\sum_{n=N+1}^{\infty}\frac{\alpha^n x^n}{n!}e^{-x^2/2}\right|^2 dx \qquad (11.14)$$

tends to zero as N tends to infinity. The integrand on the right-hand side
of (11.14) tends to zero pointwise. If we can find a suitable dominating
function, we can use dominated convergence to conclude that the integral
also tends to zero. We see that

$$\left|\sum_{n=N+1}^{\infty}\frac{\alpha^n x^n}{n!}e^{-x^2/2}\right|^2 \le \left(\sum_{n=0}^{\infty}\frac{|\alpha x|^n}{n!}e^{-x^2/2}\right)^2 = e^{2|\alpha||x|}e^{-x^2}.$$

Since this last function certainly has finite integral, dominated convergence
applies and we are done. ■
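
The convergence asserted in Lemma 11.5 can also be observed numerically. The
sketch below is an illustration only: the function name, the truncation of the
real line to a finite interval, and the trapezoid rule are all assumptions, not
part of the text. It approximates the integral in (11.14) for a given complex
$\alpha$ and cutoff N.

```python
import cmath
import math

def l2_error_squared(alpha, N, half_width=12.0, steps=4000):
    """Approximate the integral in (11.14) by the trapezoid rule.

    The real line is truncated to [-half_width, half_width]; the Gaussian
    factor makes the neglected tails negligible for moderate |alpha|.
    """
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        term, partial = 1.0 + 0j, 0j
        for n in range(N + 1):
            partial += term                 # adds (alpha x)^n / n!
            term *= alpha * x / (n + 1)     # next term of the series
        diff = (cmath.exp(alpha * x) - partial) * math.exp(-x * x / 2.0)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * abs(diff) ** 2 * h
    return total

if __name__ == "__main__":
    alpha = 1.0 + 1.0j
    for N in (2, 8, 20):
        print(N, l2_error_squared(alpha, N))
```

For a fixed $\alpha$ the printed values should shrink rapidly as N grows, in
line with the dominated convergence argument.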

Proof of Theorem 11.4. It is easily seen that the raising and lowering
operators map the Schwartz space $\mathcal{S}(\mathbb{R})$ (Definition A.15) into itself.
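Furthermore, it is easy to verify (Exercise 1) that

$$\left\langle \phi, \frac{d\psi}{dx} \right\rangle = -\left\langle \frac{d\phi}{dx}, \psi \right\rangle$$

for all $\phi, \psi \in \mathcal{S}(\mathbb{R})$. From this, we can easily verify
that for all $\phi, \psi \in \mathcal{S}(\mathbb{R})$,

$$\langle \phi, a\psi \rangle = \langle a^*\phi, \psi \rangle$$

and so also

$$\langle \phi, a^*a\psi \rangle = \langle a^*a\phi, \psi \rangle.$$

It is evident that both the ground state $\psi_0$ and all the excited states
$\psi_n$ occurring in Theorem 11.4 belong to $\mathcal{S}(\mathbb{R})$. Thus, the
proof of Theorem 11.2 is indeed valid. We conclude, then, that the $\psi_n$'s
form an orthogonal set of vectors in $L^2(\mathbb{R})$ and that they are
eigenvectors for H with the indicated eigenvalues.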

It remains to show that the $\psi_n$'s form an orthogonal basis for
$L^2(\mathbb{R})$. Let V denote the space of finite linear combinations of the
$\psi_n$'s. Since $H_n$ is a polynomial of degree n, it is easily seen that V
consists precisely of functions of the form

$$\psi(x) = p(x)e^{-x^2/2},$$

where p is a polynomial.

Lemma 11.5 then shows that $e^{ikx}e^{-x^2/2}$ belongs to the $L^2$-closure of V
for all $k \in \mathbb{R}$. Thus, if $\psi$ is orthogonal to every element of V,
we have

$$\int_{-\infty}^{\infty} e^{-ikx}e^{-x^2/2}\psi(x)\,dx = 0 \qquad (11.15)$$

for all k. Now, since $e^{-x^2/2}$ belongs to $L^\infty(\mathbb{R}) \cap L^2(\mathbb{R})$
and $\psi$ belongs to $L^2(\mathbb{R})$, their product belongs to
$L^2(\mathbb{R}) \cap L^1(\mathbb{R})$. Thus, (11.15) tells us that the $L^2$
Fourier transform of $e^{-x^2/2}\psi(x)$ is identically zero. Thus,
$e^{-x^2/2}\psi(x)$ must be the zero element of $L^2(\mathbb{R})$, by the
Plancherel theorem, and so $\psi(x) = 0$ almost everywhere. This shows that
$V^\perp = \{0\}$, meaning that V is dense in $L^2(\mathbb{R})$. ■


11.5 Exercises 

1. Show that for any Schwartz functions $\phi$ and $\psi$, we have

$$\langle \phi, a\psi \rangle = \langle a^*\phi, \psi \rangle,$$

as expected.

Hint: Use integration by parts on the interval [−A, A] and show that
the boundary terms tend to zero as A tends to infinity.

2. Show that the polynomials $H_n$ satisfy the following relations:



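$$\frac{dH_n}{d\tilde{x}} = \sqrt{2}\,n\,H_{n-1}(\tilde{x})$$

and

$$H_{n+1}(\tilde{x}) = \sqrt{2}\,\tilde{x}H_n(\tilde{x}) - n\,H_{n-1}(\tilde{x}).$$

Hint: Start with the relation $a\psi_n = n\psi_{n-1}$.

3. Establish the following Rodrigues formula for the polynomials $H_n$:

$$H_n(\tilde{x}) = \frac{(-1)^n}{2^{n/2}}\,e^{\tilde{x}^2}\frac{d^n}{d\tilde{x}^n}e^{-\tilde{x}^2}.$$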

4. In this exercise, we prove the following claim: The polynomial $H_n$ has
n distinct real zeros and the zeros of $H_n$ “interlace” with the zeros of
$H_{n-1}$, meaning that there is exactly one zero of $H_{n-1}$ between each
pair of consecutive zeros of $H_n$.

(a) Verify the claim for $H_1$ and $H_2$.

(b) Assume, inductively, that $H_n$ and $H_{n-1}$ have distinct real zeros
and that the zeros interlace. Show that $H_{n-1}$ alternates in sign
at consecutive zeros of $H_n$. Then show that $H_{n+1}$ and $H_{n-1}$ have
opposite signs at each zero of $H_n$, so that $H_{n+1}$ also alternates
in sign at consecutive zeros of $H_n$. Conclude that $H_{n+1}$ must
have at least one zero between each pair of consecutive zeros
of $H_n$.

Hint: Use Exercise 2.

(c) Show that $H_{n+1}$ and $H_{n-1}$ have the same sign near $\pm\infty$ but
opposite signs at the largest and smallest zeros of $H_n$. Conclude
that $H_{n+1}$ has at least one zero below the smallest zero of $H_n$
and at least one zero above the largest zero of $H_n$.

(d) Conclude that $H_{n+1}$ has n + 1 real zeros that interlace with the
zeros of $H_n$.

5. Let $\tilde{\psi}_n = \psi_n / \|\psi_n\|$ be the normalized nth excited state.

(a) Let $\tilde{X} = X/D$, where $D = (\hbar/m\omega)^{1/2}$. Show that

$$\left\langle \tilde{\psi}_n, \tilde{X}^2\tilde{\psi}_n \right\rangle = n + \frac{1}{2}.$$

Hint: Express $\tilde{X}$ in terms of a and a*, using (11.10), and then
use Theorem 11.2.

(b) Show that

$$\langle X \rangle_{\tilde{\psi}_n} = 0$$

$$\Delta_{\tilde{\psi}_n} X = \left(\frac{\hbar(n + 1/2)}{m\omega}\right)^{1/2}.$$

(c) If T and V denote the kinetic energy and potential energy terms,
respectively, in (11.3), show that

$$\langle T \rangle_{\tilde{\psi}_n} = \langle V \rangle_{\tilde{\psi}_n} = \frac{1}{2}\hbar\omega\left(n + \frac{1}{2}\right).$$


The Uncertainty Principle


In this chapter, we will continue our investigation of the consequences of
the commutation relations among the position and momentum operators.
We will mostly consider a particle in $\mathbb{R}^1$, where we have

$$[X, P] = i\hbar I. \qquad (12.1)$$

We have already seen that much of the analysis of the Hamiltonian H
for the quantum harmonic oscillator (given by $c_1P^2 + c_2X^2$) can be
carried out using only the commutation relation (12.1). There are two other
main results that can be derived from these commutation relations: the
Heisenberg uncertainty principle and the Stone-von Neumann theorem.
The uncertainty principle states that the product of the uncertainty in X
and the uncertainty in P cannot be smaller than $\hbar/2$. The Stone-von
Neumann theorem, meanwhile, states that any two self-adjoint operators A
and B satisfying $[A, B] = i\hbar I$ “look like” several copies of the standard
position and momentum operators acting on $L^2(\mathbb{R})$. Both results are
true only under certain technical domain conditions, which we will need to
examine carefully. We discuss the uncertainty principle in this chapter and
the Stone-von Neumann theorem in the next chapter.

The uncertainty principle states that for all $\psi$ in $L^2(\mathbb{R})$
satisfying certain domain conditions, we have

$$(\Delta_\psi X)(\Delta_\psi P) \ge \frac{\hbar}{2},$$

where, for any observable A, we let $\Delta_\psi A$ denote the “uncertainty” in
measurements of A in the state $\psi$ (Definition 3.13). This means that one
cannot make both the uncertainty in position and the uncertainty in momentum
arbitrarily small in the same state $\psi$.

Although we can easily make $\Delta_\psi X$ as small as we want simply by taking
$\psi$ to be supported in a small interval, if we do that, $\Delta_\psi P$ will
be large. Similarly, we can make $\Delta_\psi P$ as small as we like, by taking
the momentum wave function $\tilde{\psi}(p)$ (Sect. 6.6) to be supported in a
small interval, but then $\Delta_\psi X$ will get large. In the idealized limit
in which the position wave function is concentrated at a single point,
$\psi(x)$ would be a multiple of $\delta(x - a)$ for some a, in which case, the
momentum wave function $\tilde{\psi}(p)$ would be a multiple of $e^{-ipa/\hbar}$.
In that case, $|\tilde{\psi}(p)|^2$ is constant, meaning that the momentum wave
function is completely spread out over the whole real line.

This uncertainty principle may be interpreted as saying that it is
impossible to simultaneously measure the position and momentum of a quantum
particle. After all, we have said (Axiom 4) that if we perform a measurement
of an observable A with a discrete spectrum, then immediately after
the measurement the state $\psi$ of the system should be an eigenvector for A.
If A has a continuous spectrum, this principle is replaced by the requirement
that after the measurement, the uncertainty in A should be very small.
If we could measure both the position and the momentum of the particle
simultaneously with arbitrary precision, then after the measurement,
both $\Delta X$ and $\Delta P$ would have to be very small, violating the
uncertainty principle.

Now, on the scale of everyday life, Planck’s constant is very small. If,
for example, we measure mass in units of grams, distance in units of
centimeters, and time in units of seconds, then $\hbar$ has the numerical value
of $1.054 \times 10^{-27}$. Thus, on “macroscopic” scales of energy and
momentum, it is possible for the uncertainties in position and momentum both
to be very small. But on the atomic scale, the uncertainty principle puts a
substantial limitation on how localized the position and momentum of a particle
can be.
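
To make the contrast between scales concrete, the following Python sketch
evaluates the bound $\Delta v \ge \hbar/(2m\,\Delta x)$, obtained from the
uncertainty principle with $\Delta P = m\,\Delta v$. The function name and the
two sample scenarios (a one-gram bead located to within a micron, and an
electron confined to roughly an atomic diameter) are assumptions chosen for
illustration; they do not appear in the text.

```python
HBAR_CGS = 1.054e-27  # erg * s, the value quoted in the text

def min_velocity_uncertainty(mass_g, delta_x_cm):
    """Smallest velocity spread (cm/s) allowed by (Delta X)(Delta P) >= hbar/2,
    using Delta P = m * Delta v for a nonrelativistic particle."""
    return HBAR_CGS / (2.0 * mass_g * delta_x_cm)

if __name__ == "__main__":
    # A one-gram bead located to within a micron (1e-4 cm).
    print(min_velocity_uncertainty(1.0, 1e-4))
    # An electron (mass about 9.109e-28 g) confined to about 1e-8 cm.
    print(min_velocity_uncertainty(9.109e-28, 1e-8))
```

The first bound is of order $10^{-24}$ cm/s, far below anything observable,
while the second is of order $10^{7}$ cm/s.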

In Sect. 12.1, we prove a version of the uncertainty principle for any two
operators A and B satisfying $[A, B] = i\hbar I$, under a seemingly innocuous
assumption on the domains of the operators involved. In Sect. 12.2, however,
we see that the domain assumptions are not so innocuous after all.
In that section, we encounter two operators satisfying $[A, B] = i\hbar I$ on a
dense subspace of the Hilbert space, along with a vector $\psi$ such that the
uncertainty in A is finite and the uncertainty in B is zero. The existence
of such a vector is surely contrary to the spirit of the uncertainty principle,
even though it does not violate the version of the uncertainty principle
proved in Sect. 12.1. (The vector $\psi$ in Sect. 12.2 does not satisfy the
domain assumptions of Theorem 12.4.) Finally, in Sect. 12.3, we show that for
the usual position and momentum operators on $L^2(\mathbb{R})$, no such
counterexamples occur: If $\Delta_\psi X$ and $\Delta_\psi P$ are both defined,
then $(\Delta_\psi X)(\Delta_\psi P) \ge \hbar/2$.


12.1 Uncertainty Principle, First Version

In this section, it is essential that we make sure that all vectors are in 
the domains of the various operators we want to apply to these vectors. 
With this concern in mind, we make the following definition. (Compare 
Definition 9.36.) 

Definition 12.1 If A and B are unbounded operators on H, define AB to
be the operator with domain

$$\mathrm{Dom}(AB) = \{\psi \in \mathrm{Dom}(B) \mid B\psi \in \mathrm{Dom}(A)\}$$

and given by $(AB)\psi = A(B\psi)$.

Even if Dom(A) and Dom(B) are dense in H, it could happen that
Dom(AB) is not dense in H.

Recall (Definition 3.13) that the uncertainty of a symmetric operator A
in a state $\psi$ is defined to be

$$(\Delta_\psi A)^2 = \left\langle (A - \langle A\rangle_\psi I)^2 \right\rangle_\psi. \qquad (12.2)$$

As written, this definition requires that $\psi$ belong to the domain of
$(A - \langle A\rangle_\psi I)^2$, which is the same as the domain of $A^2$.
However, since we assume that A is symmetric,
$\langle A\rangle_\psi = \langle\psi, A\psi\rangle$ is real, so that
$A - \langle A\rangle_\psi I$ is again symmetric. Thus, (12.2) can be rewritten as

$$(\Delta_\psi A)^2 = \left\langle (A - \langle A\rangle_\psi I)\psi, (A - \langle A\rangle_\psi I)\psi \right\rangle.$$

Having written the uncertainty in this way, it is natural to extend the
definition of uncertainty to vectors that belong only to Dom(A), as follows.

Definition 12.2 If A is a symmetric operator on H, then for all unit
vectors $\psi$ in Dom(A), the uncertainty $\Delta_\psi A$ of A in the state
$\psi$ is given by

$$(\Delta_\psi A)^2 = \left\langle (A - \langle A\rangle_\psi I)\psi, (A - \langle A\rangle_\psi I)\psi \right\rangle. \qquad (12.3)$$

By expanding out the right-hand side of (12.3), we see that the uncertainty
may also be computed as

$$(\Delta_\psi A)^2 = \langle A\psi, A\psi\rangle - (\langle\psi, A\psi\rangle)^2.$$

[Compare (3.24).] Of course, if $\psi$ happens to be in the domain of $A^2$, then
Definition 12.2 agrees with (12.2).

Proposition 12.3 If A is a symmetric operator on H, then for all unit
vectors $\psi \in \mathrm{Dom}(A)$, we have $\Delta_\psi A = 0$ if and only if
$\psi$ is an eigenvector for A.




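Proof. If $\Delta_\psi A = 0$, then from (12.3), we see that
$(A - \langle A\rangle_\psi I)\psi = 0$, meaning that $\psi$ is an eigenvector
for A with eigenvalue $\langle A\rangle_\psi$. Conversely, if
$A\psi = \lambda\psi$ for some $\lambda$, then
$\langle\psi, A\psi\rangle = \lambda\langle\psi, \psi\rangle = \lambda$. Thus,
$(A - \langle A\rangle_\psi I)\psi = 0$,
which, by (12.3), means that $\Delta_\psi A = 0$. ■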
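
In finite dimensions every Hermitian matrix is a bounded symmetric operator,
so Definition 12.2 and Proposition 12.3 can be checked directly without domain
concerns. The sketch below is illustrative only; the helper names and the
choice of the 2×2 matrix `SIGMA_X` are assumptions, not taken from the text.

```python
import math

def matvec(A, v):
    """Apply the matrix A (a list of rows) to the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def inner(u, v):
    """Inner product, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def uncertainty(A, psi):
    """Delta_psi A computed from (12.3) for a Hermitian matrix A and unit vector psi."""
    a_psi = matvec(A, psi)
    mean = inner(psi, a_psi).real                      # <A>_psi, real since A is Hermitian
    shifted = [a - mean * p for a, p in zip(a_psi, psi)]
    return math.sqrt(inner(shifted, shifted).real)

SIGMA_X = [[0, 1], [1, 0]]

if __name__ == "__main__":
    eigenvector = [1 / math.sqrt(2), 1 / math.sqrt(2)]  # eigenvalue +1 of SIGMA_X
    print(uncertainty(SIGMA_X, eigenvector))            # essentially zero
    print(uncertainty(SIGMA_X, [1.0, 0.0]))             # not an eigenvector
```

The eigenvector gives zero uncertainty up to rounding, while the vector
(1, 0), which is not an eigenvector, gives uncertainty 1.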

As discussed in the introduction to this chapter, we expect that
immediately after a measurement of an observable A, the state of the system
will have very small uncertainty for A. Indeed, if A has discrete spectrum, 
we expect that the state of the system will be an eigenvector for A. Even 
in the case of a continuous spectrum, we expect that the uncertainty in 
A can be made as small as one wishes, by making more and more precise 
measurements. Suppose now that one wishes to observe simultaneously two 
(or more) different observables, represented by operators A and B. In the 
case of a discrete spectrum, the system after the measurement should be 
simultaneously an eigenvector for A and an eigenvector for B. In the case 
where A and B commute, this idea is reasonable. There is a version of 
the spectral theorem for commuting self-adjoint operators; in the case of 
discrete spectrum, it says that two commuting self-adjoint operators have 
an orthonormal basis of simultaneous eigenvectors with real eigenvalues. 
(In the case of unbounded operators, there are, as usual, technical domain
conditions in defining what it means for two self-adjoint operators to
commute.)

In the case where A and B do not commute, they do not need to have any 
simultaneous eigenvectors. Certainly, A and B cannot have an orthonormal 
basis of simultaneous eigenvectors, or they would in fact commute. The lack 
of simultaneous eigenvectors suggests, then, that it is simply not possible 
to make a simultaneous measurement of two self-adjoint operators unless 
they commute. In standard physics terminology, the quantities A and B 
are said to be “incommensurable,” meaning not capable of being measured 
at the same time. (See Exercise 2 for a classification of the simultaneous 
eigenvectors of a representative pair of noncommuting operators.) 

In the case of a continuous spectrum, the notion of an eigenvector is 
replaced by the notion of a state with very small uncertainty for the relevant 
operator. In light of our discussion of simultaneous eigenvectors, we may 
expect that for noncommuting operators, it may be difficult to find states 
where the uncertainties of both operators are small. This expectation is 
realized in the following version of the uncertainty principle. 

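Theorem 12.4 Suppose A and B are symmetric operators and $\psi$ is a unit
vector belonging to $\mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$. Then

$$(\Delta_\psi A)^2(\Delta_\psi B)^2 \ge \frac{1}{4}\left|\langle [A, B]\rangle_\psi\right|^2. \qquad (12.4)$$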

Note that if $\psi \in \mathrm{Dom}(AB)$ then in particular,
$\psi \in \mathrm{Dom}(B)$, and if $\psi \in \mathrm{Dom}(BA)$ then
$\psi \in \mathrm{Dom}(A)$. Thus, the assumptions on $\psi$ are sufficient
to guarantee that $\Delta_\psi A$ and $\Delta_\psi B$ make sense as in
Definition 12.2.

Proof. Define operators $A'$ and $B'$ by $A' := A - \langle\psi, A\psi\rangle I$
and $B' := B - \langle\psi, B\psi\rangle I$. (We use the same domains for $A'$
and $B'$ as for A and B, and it is easily verified that $A'$ and $B'$ are still
symmetric on those domains.) Then by the Cauchy-Schwarz inequality, we obtain

$$\langle A'\psi, A'\psi\rangle\langle B'\psi, B'\psi\rangle \ge |\langle A'\psi, B'\psi\rangle|^2 \qquad (12.5)$$

$$\ge |\mathrm{Im}\,\langle A'\psi, B'\psi\rangle|^2 \qquad (12.6)$$

$$= \frac{1}{4}\left|\langle A'\psi, B'\psi\rangle - \langle B'\psi, A'\psi\rangle\right|^2. \qquad (12.7)$$

The assumptions on $\psi$ guarantee that $B\psi \in \mathrm{Dom}(A)$ and hence
also that $B'\psi \in \mathrm{Dom}(A')$, and similarly with $A'$ and $B'$
reversed. Since $A'$ and $B'$ are symmetric, we may rewrite (12.7) as

$$\langle A'\psi, A'\psi\rangle\langle B'\psi, B'\psi\rangle \ge \frac{1}{4}\left|\langle\psi, A'B'\psi\rangle - \langle\psi, B'A'\psi\rangle\right|^2 = \frac{1}{4}\left|\langle\psi, [A', B']\psi\rangle\right|^2.$$

Now, since the identity operator commutes with everything, the commutator
of $A'$ and $B'$ is the same as the commutator of A and B. Furthermore,
$\langle A'\psi, A'\psi\rangle$ is nothing but $(\Delta_\psi A)^2$ and similarly
for B. Thus, we obtain

$$(\Delta_\psi A)^2(\Delta_\psi B)^2 \ge \frac{1}{4}\left|\langle\psi, [A, B]\psi\rangle\right|^2,$$

which is what we wanted to prove. ■
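
For Hermitian matrices, where every vector lies in every domain, inequality
(12.4) can be tested numerically. The following sketch is an illustration under
that finite-dimensional assumption; the helper names, the random sampling, and
the choice of the 2×2 matrices `SIGMA_X` and `SIGMA_Y` are not from the text.

```python
import math
import random

def matvec(A, v):
    """Apply the matrix A (a list of rows) to the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def inner(u, v):
    """Inner product, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def variance(A, psi):
    """(Delta_psi A)^2 = <A psi, A psi> - <psi, A psi>^2 for Hermitian A."""
    a_psi = matvec(A, psi)
    return inner(a_psi, a_psi).real - inner(psi, a_psi).real ** 2

def robertson_gap(A, B, psi):
    """Left side minus right side of (12.4); nonnegative up to rounding."""
    a_psi, b_psi = matvec(A, psi), matvec(B, psi)
    comm = inner(a_psi, b_psi) - inner(b_psi, a_psi)   # equals <psi, [A, B] psi>
    return variance(A, psi) * variance(B, psi) - 0.25 * abs(comm) ** 2

def random_unit_vector(dim, rng):
    """A random complex unit vector drawn with the given random.Random instance."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]
    norm = math.sqrt(sum(abs(z) ** 2 for z in v))
    return [z / norm for z in v]

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]

if __name__ == "__main__":
    rng = random.Random(0)
    gaps = [robertson_gap(SIGMA_X, SIGMA_Y, random_unit_vector(2, rng))
            for _ in range(1000)]
    print(min(gaps))                                       # never meaningfully negative
    print(robertson_gap(SIGMA_X, SIGMA_Y, [1.0, 0.0]))     # a case of equality
```

The vector (1, 0) gives equality here: both variances equal 1 and the
commutator term also equals 1.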

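We now specialize Theorem 12.4 to the case in which the commutator is
$i\hbar I$ and take the square root of both sides.

Corollary 12.5 Suppose A and B are symmetric operators satisfying

$$[A, B] = i\hbar I$$

on $\mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$. Then if
$\psi \in \mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$ is a unit vector, we have

$$(\Delta_\psi A)(\Delta_\psi B) \ge \frac{\hbar}{2}. \qquad (12.8)$$

In particular, for all unit vectors $\psi \in L^2(\mathbb{R})$ in
$\mathrm{Dom}(XP) \cap \mathrm{Dom}(PX)$, we have

$$(\Delta_\psi X)(\Delta_\psi P) \ge \frac{\hbar}{2}. \qquad (12.9)$$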

Note that the factor of $\hbar$ appearing on the right-hand side of (12.8) is
really just $|\langle\psi, [A, B]\psi\rangle|$. Since, however, $\psi$ is a unit
vector and $[A, B] = i\hbar I$, $\psi$ drops out of the right-hand side of our
inequality. We see then that both sides of (12.9) make sense whenever
$\Delta_\psi X$ and $\Delta_\psi P$ make sense, namely, whenever $\psi$ belongs
to Dom(X) and to Dom(P). (Recall Definition 12.2.) On the other hand, the proof
that we have given for (12.9) requires $\psi$ to be in both Dom(XP) and
Dom(PX). Nevertheless, it is natural to ask whether (12.9) holds for all
$\psi$ in $\mathrm{Dom}(X) \cap \mathrm{Dom}(P)$. We may similarly ask whether
(12.8) holds for all $\psi$ in $\mathrm{Dom}(A) \cap \mathrm{Dom}(B)$. As we
will see in Sects. 12.2 and 12.3, the answer to the first question is yes and
the answer to the second question is no.

Meanwhile, it is of interest to investigate “minimum uncertainty states,”
that is, states $\psi$ for which the inequality (12.4) is an equality.

Proposition 12.6 If A and B are symmetric and $\psi$ is a unit vector in
$\mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$, equality holds in (12.4) if and only
if one of the following holds: (1) $\psi$ is an eigenvector for A, (2) $\psi$
is an eigenvector for B, or (3) $\psi$ is an eigenvector for an operator of the
form

$$A - i\gamma B$$

for some nonzero real number $\gamma$.

In the case A = X and B = P, we will consider examples where equality
holds in Sect. 12.4.

Proof. To get equality in (12.4), we must have equality in both (12.5)
and (12.6). Equality in (12.5) occurs if and only if $A'\psi = 0$ or
$B'\psi = 0$ or $A'\psi = cB'\psi$ for some nonzero constant c. If $A'\psi$ is
zero, $\psi$ is an eigenvector for A with eigenvalue $\langle A\rangle_\psi$.
In that case, equality holds in (12.6) as well. Conversely, if $\psi$ is an
eigenvector for A with some eigenvalue $\lambda$, then
$\langle A\rangle_\psi = \lambda$ and $A'\psi = 0$. Similarly, $B'\psi = 0$ if
and only if $\psi$ is an eigenvector for B.

Meanwhile, suppose $A'\psi$ and $B'\psi$ are nonzero and $A'\psi = cB'\psi$, so
that equality holds in (12.5). Then equality holds in (12.6) if and only if
$c = i\gamma$ for some nonzero $\gamma \in \mathbb{R}$. Thus, when $A'\psi$ and
$B'\psi$ are nonzero, we get equality in (12.4) if and only if

$$A'\psi = i\gamma B'\psi \qquad (12.10)$$

for some nonzero real number $\gamma$. Recalling the definition of $A'$ and
$B'$, (12.10) says that

$$(A - \langle\psi, A\psi\rangle I)\psi = i\gamma(B - \langle\psi, B\psi\rangle I)\psi \qquad (12.11)$$

or

$$(A - i\gamma B)\psi = \lambda\psi, \qquad (12.12)$$

where $\lambda = \langle\psi, A\psi\rangle - i\gamma\langle\psi, B\psi\rangle$.

Thus, if (12.11) holds, $\psi$ is an eigenvector of $A - i\gamma B$. Conversely,
if $\psi$ is an eigenvector for $A - i\gamma B$ with some eigenvalue
$\lambda = c + id$ in $\mathbb{C}$, then

$$(c + id)\|\psi\|^2 = \langle\psi, (A - i\gamma B)\psi\rangle = \langle\psi, A\psi\rangle - i\gamma\langle\psi, B\psi\rangle. \qquad (12.13)$$

Since A and B are assumed to be symmetric and $\psi$ is a unit vector, we
may equate real and imaginary parts in (12.13) to obtain

$$c = \langle\psi, A\psi\rangle; \qquad d = -\gamma\langle\psi, B\psi\rangle.$$

From this we can see that (12.11) and (12.10) hold, and thus equality holds
in (12.4). ■


12.2 A Counterexample 

In this section, we consider the Hilbert space $L^2[-1, 1]$. As our “position”
operator, we use the usual formula,

$$A\psi(x) = x\psi(x).$$

Note that A is a bounded operator, because we restrict x to the bounded
interval [−1, 1]. As such, A is defined (and self-adjoint) on the whole Hilbert
space $L^2[-1, 1]$. As our “momentum” operator, we again use the usual formula,

$$B = -i\hbar\frac{d}{dx}.$$

As the domain of B we will take the space of continuously differentiable
functions $\psi$ on [−1, 1] satisfying the periodic boundary condition,

$$\psi(-1) = \psi(1). \qquad (12.14)$$

To verify that B is symmetric, note that for any $C^1$ functions $\phi$ and
$\psi$, we have

$$\int_{-1}^{1}\overline{\phi(x)}\,\frac{d\psi}{dx}\,dx = \overline{\phi(1)}\psi(1) - \overline{\phi(-1)}\psi(-1) - \int_{-1}^{1}\overline{\frac{d\phi}{dx}}\,\psi(x)\,dx.$$

If both $\phi$ and $\psi$ satisfy the periodic boundary condition (12.14), the
boundary terms cancel out to zero. This shows that the operator d/dx is
skew-symmetric on Dom(B), from which it follows that $-i\hbar\,d/dx$ is
symmetric on Dom(B). Actually, since the functions

$$\psi_n(x) := \frac{1}{\sqrt{2}}e^{\pi inx}, \quad n \in \mathbb{Z}, \qquad (12.15)$$

constitute an orthonormal basis of eigenvectors for B with real eigenvalues,
B is essentially self-adjoint, by Example 9.25.

Now, for all $\psi \in \mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$ we have, by
direct calculation,

$$AB\psi - BA\psi = i\hbar\psi, \qquad (12.16)$$

just as for the usual position and momentum operators. Furthermore,
$\mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$ is dense in H, since it contains all
continuously differentiable functions $\psi$ such that
$\psi(-1) = \psi(1) = 0$. Consider, now, the function $\psi_n(x)$ in (12.15),
for some integer n. Clearly, $\psi_n$ is in the domain of B, since $B\psi_n$ is
just a multiple of $\psi_n$. Since $\psi_n$ is an eigenvector for B, the
uncertainty of B in the state $\psi_n$ is zero! Meanwhile, since A is bounded,
the uncertainty of A is well defined and finite. Thus, $\Delta_{\psi_n} A$ and
$\Delta_{\psi_n} B$ are both unambiguously defined and

$$(\Delta_{\psi_n} A)(\Delta_{\psi_n} B) = 0. \qquad (12.17)$$

How can (12.17) hold? Is it not, in light of (12.16), a violation of (12.8)
in Corollary 12.5? The answer is no, for the reason that $\psi_n$ does not
satisfy the domain assumptions in that corollary. Specifically, $A\psi_n$ is not
in the domain of B, since $A\psi_n$ does not satisfy the periodic boundary
condition in the definition of Dom(B). Thus, $\psi_n$ does not belong to
Dom(BA).

Although it does not contradict Corollary 12.5, (12.17) certainly violates
the spirit of the uncertainty principle. In the next section, we will show
that no such strange counterexamples occur for the usual position and
momentum operators.
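
The counterexample can be checked numerically. The sketch below works in units
with $\hbar = 1$ and uses Simpson's rule on [−1, 1]; these choices, along with
all function names, are assumptions made for illustration. It computes both
uncertainties for $\psi_n$ and then tests the periodic boundary condition
(12.14) for $\psi_n$ and for $A\psi_n$.

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1 (an assumption for the illustration)

def psi_n(n):
    """Eigenfunction (12.15) on [-1, 1]."""
    return lambda x: cmath.exp(1j * math.pi * n * x) / math.sqrt(2.0)

def simpson(f, steps=2000):
    """Composite Simpson rule on [-1, 1]; steps must be even."""
    h = 2.0 / steps
    total = f(-1.0) + f(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(-1.0 + i * h)
    return total * h / 3.0

def uncertainty(apply_op, psi):
    """Delta_psi of an operator, computed from (12.3)."""
    op_psi = apply_op(psi)
    mean = simpson(lambda x: psi(x).conjugate() * op_psi(x)).real
    return math.sqrt(simpson(lambda x: abs(op_psi(x) - mean * psi(x)) ** 2))

def apply_A(psi):
    """Position operator: multiplication by x."""
    return lambda x: x * psi(x)

def apply_B_to_psi_n(n):
    """-i hbar d/dx on psi_n, differentiated by hand: hbar * pi * n * psi_n."""
    return lambda psi: (lambda x: HBAR * math.pi * n * psi(x))

def boundary_mismatch(f):
    """f(1) - f(-1); zero exactly when f satisfies the periodic condition (12.14)."""
    return f(1.0) - f(-1.0)

if __name__ == "__main__":
    n = 3
    psi = psi_n(n)
    print(uncertainty(apply_A, psi))                 # finite
    print(uncertainty(apply_B_to_psi_n(n), psi))     # essentially zero
    print(abs(boundary_mismatch(psi)))               # psi_n is periodic
    print(abs(boundary_mismatch(apply_A(psi))))      # A psi_n is not
```

Since $|\psi_n|^2 = 1/2$ on [−1, 1], the uncertainty in A comes out as
$1/\sqrt{3}$, while the mismatch of $A\psi_n$ at the endpoints has modulus
$\sqrt{2}$, confirming that $A\psi_n$ lies outside Dom(B).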


12.3 Uncertainty Principle, Second Version 

In this section, we will see that if A and B are taken to be the usual
position and momentum operators X and P, the uncertainty principle holds
whenever $\Delta_\psi X$ and $\Delta_\psi P$ are defined. We continue to use
Definition 12.2 for the definition of the uncertainty in any operator, in which
case, for $\Delta_\psi X$ and $\Delta_\psi P$ to be defined, we require only
that $\psi$ belong to Dom(X) and Dom(P).

We are now ready to formulate the strong version of the uncertainty
principle.

Theorem 12.7 Suppose $\psi$ is a unit vector in $L^2(\mathbb{R})$ belonging to
$\mathrm{Dom}(X) \cap \mathrm{Dom}(P)$. Then

$$(\Delta_\psi X)(\Delta_\psi P) \ge \frac{\hbar}{2}, \qquad (12.18)$$

where $\Delta_\psi X$ and $\Delta_\psi P$ are given by Definition 12.2.

Proof. According to Stone’s theorem and Example 10.16, the operator P
is $\hbar$ times the infinitesimal generator of the group $U(\cdot)$ of
translations. That is to say, for all $\psi \in \mathrm{Dom}(P)$, we have

$$(P\psi)(x) = -i\hbar\lim_{a\to 0}\frac{\psi(x + a) - \psi(x)}{a},$$

where the limit is in the $L^2$ norm sense. Thus,

$$\langle X\psi, P\psi\rangle = \lim_{a\to 0}\left\langle X\psi, -i\hbar\,\frac{\psi(x + a) - \psi(x)}{a}\right\rangle$$

$$= \lim_{a\to 0}\left(\frac{1}{a}\langle x\psi(x), -i\hbar\psi(x + a)\rangle + \frac{i\hbar}{a}\langle X\psi, \psi\rangle\right)$$

$$= \lim_{a\to 0}\left(\frac{1}{a}\langle i\hbar(y - a)\psi(y - a), \psi(y)\rangle + \frac{i\hbar}{a}\langle X\psi, \psi\rangle\right),$$

where in the last step we have made the change of variable y = x + a.

If we rename the variable of integration back to x, we get

$$\langle X\psi, P\psi\rangle = \lim_{a\to 0}\left(\left\langle i\hbar X\,\frac{\psi(x - a) - \psi(x)}{a}, \psi(x)\right\rangle + i\hbar\langle\psi(x - a), \psi(x)\rangle\right)$$

$$= \lim_{a\to 0}\left(\left\langle i\hbar\,\frac{\psi(x - a) - \psi(x)}{a}, X\psi(x)\right\rangle + i\hbar\langle\psi(x - a), \psi(x)\rangle\right)$$

$$= \langle P\psi, X\psi\rangle + i\hbar\langle\psi, \psi\rangle. \qquad (12.19)$$

In the second equality, we have used that X is symmetric and that (check)
if $\psi \in \mathrm{Dom}(X)$, then $\psi(x - a) \in \mathrm{Dom}(X)$ for each
fixed a. In the last equality, we get a minus sign from having
$\psi(x - a) - \psi(x)$ rather than $\psi(x + a) - \psi(x)$, and we use that
translation is strongly continuous.

It should be noted that (12.19) is precisely what we would get by formally
moving X to the right-hand side of the inner product, using the commutation
relation $XP - PX = i\hbar I$, and then moving P to the left-hand side of
the inner product. But to make that calculation rigorous, we would need to
assume that $\psi$ is in the domain of XP and the domain of PX. In (12.19),
on the other hand, we have obtained the desired conclusion assuming only
that $\psi$ is in the domain of X and in the domain of P.

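Having obtained (12.19), we can easily verify that for any real constants
$\alpha$ and $\beta$, we have

$$\langle (X - \alpha I)\psi, (P - \beta I)\psi\rangle = \langle (P - \beta I)\psi, (X - \alpha I)\psi\rangle + i\hbar\langle\psi, \psi\rangle. \qquad (12.20)$$

Solving (12.20) for $\langle\psi, \psi\rangle$ gives

$$\langle\psi, \psi\rangle = \frac{1}{i\hbar}\left(\langle (X - \alpha I)\psi, (P - \beta I)\psi\rangle - \langle (P - \beta I)\psi, (X - \alpha I)\psi\rangle\right)$$

$$= \frac{2}{\hbar}\,\mathrm{Im}\,\langle (X - \alpha I)\psi, (P - \beta I)\psi\rangle$$

$$\le \frac{2}{\hbar}\,\|(X - \alpha I)\psi\|\,\|(P - \beta I)\psi\|, \qquad (12.21)$$

by the Cauchy-Schwarz inequality. If $\psi$ is a unit vector and we take
$\alpha = \langle X\rangle_\psi$ and $\beta = \langle P\rangle_\psi$, then
$\|(X - \alpha I)\psi\|^2 = (\Delta_\psi X)^2$ and
$\|(P - \beta I)\psi\|^2 = (\Delta_\psi P)^2$. Thus, we get

$$1 \le \frac{2}{\hbar}(\Delta_\psi X)(\Delta_\psi P),$$

which is equivalent to what we want to prove. ■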
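
As a concrete check of (12.18), consider the normalized "tent" function
$\psi(x) = \sqrt{3/2}\,(1 - |x|)$ for $|x| \le 1$ and $\psi(x) = 0$ otherwise.
This example is an illustration and is not taken from the text. The function
belongs to $\mathrm{Dom}(X) \cap \mathrm{Dom}(P)$, being compactly supported
with a bounded weak derivative, and since it is real and even, both
expectation values vanish. A direct computation gives the following.

```latex
\begin{align*}
\|\psi\|^2 &= \frac{3}{2}\int_{-1}^{1}(1-|x|)^2\,dx
            = \frac{3}{2}\cdot\frac{2}{3} = 1, \\
\langle X\rangle_\psi &= 0, \qquad \langle P\rangle_\psi = 0, \\
(\Delta_\psi X)^2 &= \frac{3}{2}\int_{-1}^{1}x^2(1-|x|)^2\,dx
            = \frac{3}{2}\cdot\frac{1}{15} = \frac{1}{10}, \\
(\Delta_\psi P)^2 &= \hbar^2\int_{-1}^{1}|\psi'(x)|^2\,dx
            = \hbar^2\cdot\frac{3}{2}\cdot 2 = 3\hbar^2, \\
(\Delta_\psi X)(\Delta_\psi P) &= \sqrt{\tfrac{3}{10}}\,\hbar
            \approx 0.548\,\hbar \;\ge\; \frac{\hbar}{2}.
\end{align*}
```

The inequality holds with some room to spare for this state.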

We know from Sect. 12.2 that the strong form of the uncertainty principle
does not hold if X and P are replaced by two arbitrary operators satisfying
$AB - BA = i\hbar I$ on $\mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$, even if
$\mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$ is dense in H. Nevertheless, if we
look carefully at the proof of Theorem 12.7, we can see what assumptions we
would need on A and B to make the proof go through in a more general setting.

Theorem 12.8 Suppose A and B are self-adjoint operators on H. Suppose
that for all $a \in \mathbb{R}$ and $\psi \in \mathrm{Dom}(A)$, we have that
$e^{iaB}\psi$ belongs to Dom(A) and that

$$Ae^{iaB}\psi = e^{iaB}A\psi - \hbar a\,e^{iaB}\psi. \qquad (12.22)$$

Then for all unit vectors $\psi$ in $\mathrm{Dom}(A) \cap \mathrm{Dom}(B)$, we
have

$$(\Delta_\psi A)(\Delta_\psi B) \ge \frac{\hbar}{2},$$

where $\Delta_\psi A$ and $\Delta_\psi B$ are defined by Definition 12.2.

The relation

$$e^{iaB}A = Ae^{iaB} + \hbar a\,e^{iaB}, \quad a \in \mathbb{R}, \qquad (12.23)$$

which holds on Dom(A), is a “semi-exponentiated” form of the canonical
commutation relations. As shown in Exercise 6, there is a formal argument
(ignoring domain issues) that the commutation relations $[A, B] = i\hbar I$
ought to imply the relations (12.22). Nevertheless, as Exercise 7 shows, this
formal argument does not always give the correct conclusion. In Sect. 14.2, we
will encounter a “fully exponentiated” form of the canonical commutation
relations, in which both A and B are exponentiated.

Proof. See Exercise 5. ■

Corollary 12.9 For any j = 1, ..., n and any unit vector
$\psi \in L^2(\mathbb{R}^n)$ with
$\psi \in \mathrm{Dom}(X_j) \cap \mathrm{Dom}(P_j)$, we have

$$(\Delta_\psi X_j)(\Delta_\psi P_j) \ge \frac{\hbar}{2}.$$

Proof. In the case that $A = X_j$ and $B = P_j$, we have
$(e^{iaB/\hbar}\psi)(\mathbf{x}) = \psi(\mathbf{x} + a\mathbf{e}_j)$, by
Exercise 2 in Chap. 10. Thus, in this case, (12.22) says that

$$(x_j + a)\psi(\mathbf{x} + a\mathbf{e}_j) = x_j\psi(\mathbf{x} + a\mathbf{e}_j) + a\psi(\mathbf{x} + a\mathbf{e}_j),$$

which is true. ■


12.4 Minimum Uncertainty States

In this section, we look at the states that give equality in the uncertainty
principle. Such states are known as minimum uncertainty states or coherent
states. As in the general setting of Proposition 12.6, the condition for
equality is an eigenvector condition. That is to say, even though in
Theorem 12.7, we allow $\psi$'s that are not in
$\mathrm{Dom}(XP) \cap \mathrm{Dom}(PX)$, we do not get any new minimum
uncertainty states by this weakening of our domain assumptions.

Proposition 12.10 A unit vector
$\psi \in \mathrm{Dom}(X) \cap \mathrm{Dom}(P)$ satisfies

$$(\Delta_\psi X)(\Delta_\psi P) = \frac{\hbar}{2}$$

if and only if $\psi$ satisfies

$$(X + i\delta P)\psi = \lambda\psi \qquad (12.24)$$

for some nonzero real number $\delta$ and some complex number $\lambda$.

For convenience, we have made the substitution $\delta = -\gamma$ in (12.24)
relative to Proposition 12.6.

FIGURE 12.1. Minimum uncertainty state with $\langle X\rangle = 1$,
$\langle P\rangle = 0$, and $\Delta X = 1/2$.

Proof. All the relations in the proof of Theorem 12.7 are equalities, except
for the inequality in the last line of (12.21). Equality will hold in that line
if and only if one of $(X - \alpha I)\psi$ and $(P - \beta I)\psi$ is zero or
$(P - \beta I)\psi$ is a pure-imaginary multiple of $(X - \alpha I)\psi$. Now,
if $\psi$ is a unit vector in $L^2(\mathbb{R})$, then neither $\psi$ nor the
Fourier transform of $\psi$ can be supported at a single point; thus, neither
$(X - \alpha I)\psi$ nor $(P - \beta I)\psi$ can be zero. We are left,
then, with the condition that

$$(X - \alpha I)\psi = i\gamma(P - \beta I)\psi, \qquad (12.25)$$

where $\gamma$ is a nonzero real number, $\alpha = \langle X\rangle_\psi$, and
$\beta = \langle P\rangle_\psi$. As in the proof of Proposition 12.6, (12.25)
is equivalent to the assertion that $\psi$ is an eigenvector for the operator
$X - i\gamma P$. Letting $\delta = -\gamma$ gives the desired result. ■

FIGURE 12.2. Minimum uncertainty state with $\langle X\rangle = 1$,
$\langle P\rangle = 10$, and $\Delta X = 1/2$.

Proposition 12.11 If the parameter $\delta$ in (12.24) is negative, there are
no nonzero solutions to (12.24). If the parameter $\delta$ is positive, there
exists a unique (up to multiplication by a constant) solution
$\psi_{\delta,\lambda}$ to (12.24) for every complex number $\lambda$. The
function $\psi_{\delta,\lambda}$ has the following additional properties:

$$\langle X\rangle = \mathrm{Re}\,\lambda$$

$$\langle P\rangle = \frac{1}{\delta}\,\mathrm{Im}\,\lambda$$

$$\frac{\Delta X}{\Delta P} = \delta.$$

Explicitly, we have

$$\psi_{\delta,\lambda}(x) = c_1\exp\left\{-\frac{(x - \lambda)^2}{2\delta\hbar}\right\} = c_2\exp\left\{-\frac{(x - \langle X\rangle)^2}{2\delta\hbar}\right\}\exp\left\{\frac{i\langle P\rangle x}{\hbar}\right\},$$

where all expectation values are taken in the state $\psi_{\delta,\lambda}$.

Note that among states with $(\Delta X)(\Delta P) = \hbar/2$, we can arrange for
$\Delta X/\Delta P$ to be any positive real number, and once we have chosen
$\Delta X/\Delta P$, we can then arrange for $\langle X\rangle$ and
$\langle P\rangle$ to be any two real numbers. On the other hand, once
$\Delta X/\Delta P$ and $\langle X\rangle$ and $\langle P\rangle$ have been
specified, there is a unique quantum state with
$(\Delta X)(\Delta P) = \hbar/2$. In Figs. 12.1-12.3, we have plotted the real
part of $\psi_{\delta,\lambda}$ for several different values of the parameters,
in a system of units for which $\hbar = 1$.

FIGURE 12.3. Minimum uncertainty state with $\langle X\rangle = 1$,
$\langle P\rangle = 20$, and $\Delta X = 1$.

Proof. The equation $(X + i\delta P)\psi = \lambda\psi$ amounts to

$$\frac{d\psi}{dx} = -\frac{x - \lambda}{\delta\hbar}\,\psi(x), \qquad (12.26)$$

where $\psi$ is assumed to be in the domain of P, so that the distributional
derivative of $\psi$ is an $L^2$ function. If $\psi$ were smooth, then the
unique solution to (12.26) would be the function $\psi_{\delta,\lambda}$ given
in the proposition, which is square-integrable if and only if $\delta > 0$.
Even if (12.26) is only assumed to hold in the distribution sense, the argument
in the proof of Proposition 9.29 (with $e^{-x/\hbar}\psi(x)$ replaced by
$\exp[(x - \lambda)^2/(2\delta\hbar)]\psi(x)$) shows that there are no
additional solutions. The formulas for $\langle X\rangle$, $\langle P\rangle$,
and $\Delta X/\Delta P$ can be computed either by tracing through the arguments
in the proof of Theorem 12.7 or by direct calculation with the formula for
$\psi_{\delta,\lambda}$. ■
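
The "direct calculation" route can be imitated numerically. The following
sketch works in units with $\hbar = 1$ and replaces integrals by Riemann sums
on a finite interval; these choices, the function name, and the sample
parameter values are assumptions made for illustration. It evaluates
$\langle X\rangle$, $\langle P\rangle$, $\Delta X$, and $\Delta P$ for the
function $\exp\{-(x - \lambda)^2/(2\delta\hbar)\}$.

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1, as in the figures

def coherent_state_moments(delta, lam, half_width=10.0, steps=4000):
    """Return (<X>, <P>, Delta X, Delta P) for psi(x) = exp(-(x - lam)^2 / (2 delta hbar)).

    Integrals are replaced by Riemann sums on an interval centred at Re(lam);
    the common step size cancels in every ratio. P psi = -i hbar psi' is
    evaluated from the closed-form derivative psi' = -(x - lam)/(delta hbar) psi.
    half_width should be many multiples of sqrt(delta * hbar).
    """
    lam = complex(lam)
    h = 2.0 * half_width / steps
    xs = [lam.real - half_width + i * h for i in range(steps + 1)]
    psi = [cmath.exp(-(x - lam) ** 2 / (2.0 * delta * HBAR)) for x in xs]
    p_psi = [1j * (x - lam) / delta * f for x, f in zip(xs, psi)]

    norm = sum(abs(f) ** 2 for f in psi)
    mean_x = sum(x * abs(f) ** 2 for x, f in zip(xs, psi)) / norm
    mean_p = (sum(f.conjugate() * g for f, g in zip(psi, p_psi)) / norm).real
    var_x = sum((x - mean_x) ** 2 * abs(f) ** 2 for x, f in zip(xs, psi)) / norm
    var_p = sum(abs(g - mean_p * f) ** 2 for f, g in zip(psi, p_psi)) / norm
    return mean_x, mean_p, math.sqrt(var_x), math.sqrt(var_p)

if __name__ == "__main__":
    mean_x, mean_p, dx, dp = coherent_state_moments(delta=0.5, lam=1.0 + 2.0j)
    print(mean_x, mean_p)     # compare with Re(lam) and Im(lam) / delta
    print(dx * dp, dx / dp)   # compare with hbar / 2 and delta
```

With $\delta = 1/2$ and $\lambda = 1 + 2i$, the proposition predicts
$\langle X\rangle = 1$, $\langle P\rangle = 4$, $(\Delta X)(\Delta P) = 1/2$,
and $\Delta X/\Delta P = 1/2$.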

12.5 Exercises 

1. Let a be a positive real number. Show that the following “additive”
version of the uncertainty principle holds for all unit vectors
$\psi \in \mathrm{Dom}(X) \cap \mathrm{Dom}(P)$:



a 


2. In this exercise, we classify the simultaneous eigenvectors of the
noncommuting operators $J_1$ and $J_2$. Let $J_1$, $J_2$, and $J_3$ denote the
angular momentum operators on $L^2(\mathbb{R}^3)$ as defined in Sect. 3.10.
Suppose $\psi$ is in the domain of any product $J_jJ_k$ of two angular momentum
operators. (For example, $\psi$ could be a Schwartz function.) Suppose also
that $\psi$ is an eigenvector for $J_1$ and for $J_2$ with eigenvalues $\alpha$
and $\beta$, respectively.

(a) Using the commutation relations in Exercise 10 in Chap. 3, show
that $\psi$ is an eigenvector for $J_3$ with eigenvalue 0.

(b) Show that the eigenvalues $\alpha$ and $\beta$ for $J_1$ and $J_2$ must be
zero.

(c) What type of function $\psi \in L^2(\mathbb{R}^3)$ satisfies $J_j\psi = 0$
for j = 1, 2, 3?

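3. Given any unit vector $\psi \in \mathrm{Dom}(X) \cap \mathrm{Dom}(P)$, consider another
vector $\phi$ given by

\[
\phi(x) = e^{ibx/\hbar}\,\psi(x - a).
\]

Show that $\phi$ is a unit vector belonging to $\mathrm{Dom}(X) \cap \mathrm{Dom}(P)$ and
that

\[
\langle X\rangle_\phi = \langle X\rangle_\psi + a, \qquad \Delta_\phi X = \Delta_\psi X
\]

and

\[
\langle P\rangle_\phi = \langle P\rangle_\psi + b, \qquad \Delta_\phi P = \Delta_\psi P.
\]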

4. We have seen that a unit vector $\psi \in \mathrm{Dom}(X) \cap \mathrm{Dom}(P)$ is a minimum
uncertainty state [i.e., $(\Delta_\psi X)(\Delta_\psi P) = \hbar/2$] if and only if there exists
some $\delta > 0$ such that $\psi$ is an eigenvector of the operator $X + i\delta P$.
In that case, $\psi$ is also an eigenvector for any operator of the form
$c(X + i\delta P)$, with $c$ being a nonzero constant. Consider, then, some
fixed $\delta > 0$ and define an operator $a$ by the formula

\[
a = \frac{1}{\sqrt{2\delta\hbar}}\,(X + i\delta P).
\]

Then $a$ is just the annihilation operator, as defined in Chap. 11, for a
harmonic oscillator with $m\omega = 1/\delta$. Thus, $a$ and its adjoint $a^*$ satisfy
the relation $[a, a^*] = I$, and we have the “chain” of eigenvectors
$\psi_n \in L^2(\mathbb{R})$ satisfying the properties listed in Theorem 11.2.

(a) For any $\lambda \in \mathbb{C}$, find constants $c_n$ so that the vector

\[
\phi_\lambda = \sum_{n=0}^{\infty} c_n\psi_n
\]

is an eigenvector for $a$ with eigenvalue $\lambda$. Show that the resulting
series converges in H.

(b) Let $\phi_\lambda$ denote the eigenvector obtained in Part (a), normalized
so that $c_0 = 1$. Show that

\[
\phi_\lambda = e^{\lambda a^*}\phi_0,
\]

where the exponential is defined by

\[
e^{\lambda a^*}\phi_0 = \sum_{n=0}^{\infty} \frac{\lambda^n}{n!}\,(a^*)^n\phi_0,
\]

with convergence in $L^2(\mathbb{R})$.

5. Prove Theorem 12.8, following the outline of the proof of
Theorem 12.7. Recall from Sect. 10.2 that $B/\hbar$ is the infinitesimal
generator of the one-parameter unitary group $U(a) := e^{iaB/\hbar}$.

6. If $X$ and $Y$ are bounded operators, we may define $\mathrm{ad}_X(Y) = [X, Y]$,
where $[X, Y] = XY - YX$. Thus, say, $(\mathrm{ad}_X)^3(Y) = [X, [X, [X, Y]]]$.
It is not hard to show that for any bounded operators $X$ and $Y$, we
have

\[
e^{X}Ye^{-X} = e^{\mathrm{ad}_X}(Y)
= Y + [X, Y] + \frac{1}{2!}[X, [X, Y]] + \frac{1}{3!}[X, [X, [X, Y]]] + \cdots. \tag{12.27}
\]

(See Proposition 2.25 and Exercise 2.19 of [21].)

Suppose $A$ and $B$ are unbounded self-adjoint operators satisfying
$[A, B] = i\hbar I$ on $\mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$. Show that if we could
apply (12.27) with $X = iaB/\hbar$ and $Y = A$ (even though $X$ and $Y$ are
unbounded), then $A$ and $B$ would satisfy (12.22).

7. Let $A$ be the operator in Sect. 12.2, and let $B$ be the unique
self-adjoint extension of the operator $B$ in that section. Show that the
operators $X = iaB/\hbar$ and $Y = A$ do not satisfy (12.27).

Note: This result shows the hazards involved in formally applying results
for bounded operators to unbounded operators.

Hint: Show that the unitary operators $U(a) := \exp(iaB/\hbar)$ consist
of “translation with wrap around,” first on the eigenvectors of $B$ and
then on the whole Hilbert space.




13 

Quantization Schemes for Euclidean 
Space 


13.1 Ordering Ambiguities 

One of the axioms of quantum mechanics states, “To each real-valued
function $f$ on the classical phase space there is associated a self-adjoint
operator $\hat{f}$ on the quantum Hilbert space.” The attentive reader will note
that we have not, up to this point, given a general procedure for
constructing $\hat{f}$ from $f$. If we call $\hat{f}$ the quantization of $f$, then we have only
discussed the quantizations of a few very special classical observables, such
as position, momentum, and energy.

Let us now think about what would go into quantizing a (more-or-less)
general observable. Let us consider for simplicity a particle moving in $\mathbb{R}^1$
and let us assume that quantizations of $x$ and $p$ are the usual position
and momentum operators $X$ and $P$. What should the quantization of, say,
$xp$ be? Classically, $xp$ and $px$ are the same, but quantum mechanically,
$XP$ does not equal $PX$. Furthermore, neither $XP$ nor $PX$ is self-adjoint,
because $(XP)^* = P^*X^* = PX$, and $PX \neq XP$. In this case, then, a
reasonable candidate for the quantization would be

\[
\widehat{xp} = \frac{1}{2}(XP + PX).
\]
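The failure of commutativity can be checked concretely. The following is a minimal sketch, not part of the original text, in which $X$ and $P$ act on polynomials in $x$ stored as coefficient lists; the choice $\hbar = 1$ and the helper names `X`, `P`, `combine`, and `close` are assumptions made only for illustration.

```python
# Sketch: X and P acting on polynomials in x, stored as coefficient lists
# [c0, c1, c2, ...] for c0 + c1*x + c2*x^2 + ...  (hbar = 1 is assumed).
HBAR = 1.0

def X(poly):
    """Position operator: multiply by x (shift coefficients up one degree)."""
    return [0j] + list(poly)

def P(poly):
    """Momentum operator: -i*hbar*d/dx applied to a coefficient list."""
    return [-1j * HBAR * k * c for k, c in enumerate(poly)][1:] or [0j]

def combine(u, v, cu=1, cv=1):
    """Return cu*u + cv*v, padding the shorter coefficient list with zeros."""
    n = max(len(u), len(v))
    u = list(u) + [0j] * (n - len(u))
    v = list(v) + [0j] * (n - len(v))
    return [cu * s + cv * t for s, t in zip(u, v)]

def close(u, v, tol=1e-12):
    """True if two coefficient lists agree up to tol (after zero-padding)."""
    return all(abs(c) < tol for c in combine(u, v, 1, -1))

psi = [1, 2, 0, 5]                       # test polynomial 1 + 2x + 5x^3
xp = X(P(psi))                           # XP psi: P acts first
px = P(X(psi))                           # PX psi: X acts first
commutator = combine(xp, px, 1, -1)      # (XP - PX) psi
symmetrized = combine(xp, px, 0.5, 0.5)  # (XP + PX) psi / 2
```

On this test polynomial, `commutator` equals $i\hbar\psi$, and `symmetrized` equals $XP\psi - (i\hbar/2)\psi$, so the symmetrized candidate differs from either ordering by a multiple of the identity.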

The significance of this simple example is that the failure of
commutativity among quantum operators creates an ambiguity in the quantization
process. It does not make sense to simply “replace $x$ by $X$ and $p$ by $P$
everywhere in the formula,” since the ordering of position and
momentum makes no difference on the classical side, but it does on the quantum
side. Up to this point, we have not really had to confront this ambiguity,
because of the special form of the observables we have quantized. The
Hamiltonian, for example, is typically of the form $H(x, p) = p^2/(2m) +
V(x)$. Since each term contains only $x$ or only $p$, it is natural to quantize
$H$ to $\hat{H} = P^2/(2m) + V(X)$, where $V(X)$ may be defined by the functional
calculus or simply as multiplication by $V(x)$. In defining the angular
momentum operators, we do encounter products of position and momentum,
but never of the same component of position and momentum. For a
particle in $\mathbb{R}^2$, for example, we have $J = x_1p_2 - x_2p_1$. On the quantum side,
$X_1$ commutes with $P_2$ and $X_2$ with $P_1$, and thus there is no ambiguity:
$X_1P_2 - X_2P_1$ is the same as $P_2X_1 - P_1X_2$.

When we turn to the quantization of a general observable, however, 
we must confront the ordering ambiguity directly. Groenewold’s theorem 
(Sect. 13.4) suggests that there is no single “perfect” quantization scheme. 
Nevertheless, there is one that is generally acknowledged as having the best 
properties, the Weyl quantization, and we spend most of our time with 
that particular scheme. Other quantization schemes do also play a role in 
physics, however; Wick-ordered quantization, notably, plays an important 
role in quantum field theory. (In quantum field theory, the replacement of 
certain Weyl-quantized operators with their Wick-quantized counterparts 
is interpreted as a type of renormalization.) 


13.2 Some Common Quantization Schemes 

In this section, we consider several of the most commonly used quantization
schemes. For simplicity, we limit our attention to systems with one degree
of freedom and to classical observables that are polynomials in $x$ and $p$.
(We consider the Weyl quantization in greater generality in Sect. 13.3.)
Furthermore, we resolve in this section not to worry about domain questions
and simply to use $C_c^\infty(\mathbb{R})$ as the domain for all of our operators. Thus,
in this section, equality of operators means equality as maps of $C_c^\infty(\mathbb{R})$ to
itself. It should be noted that operators of the sort we will be considering
may very well fail to be essentially self-adjoint, even if they are symmetric.
Section 9.10 shows, for example, that the operator $P^2 - cX^4$, for $c >
0$, is not essentially self-adjoint on $C_c^\infty(\mathbb{R})$. We follow the terminology of
harmonic analysis by referring to a classical observable $f$ as the symbol of its
quantization $\hat{f}$. Once we have discussed each quantization scheme briefly,
we will formalize the definitions of all the schemes in Definition 13.1.

The simplest approach to quantization is to choose, once and for all,
which to put first, the position or the momentum operators. We may, for
example, choose to put the momentum operators to the right, acting first,
and the position operators to the left, acting second. In this approach, a
polynomial in $x$ and $p$ will quantize to a differential operator in “standard
form,” with all the derivatives acting first, followed by multiplication
operators. In harmonic analysis, there is a method for extending this
quantization scheme to more-or-less arbitrary symbols $f$. For a general
(nonpolynomial) symbol $f$, the resulting operator $\hat{f}$ is known as a
pseudodifferential operator.

A serious drawback of the pseudodifferential quantization is that even
when the symbol $f$ is real-valued, the operator $\hat{f}$ it produces is typically
not self-adjoint (or even symmetric). If, for example, $f(x, p) = xp$, then the
associated operator is $XP$, the adjoint of which is $PX$, which is not equal
to $XP$. The simplest way to fix this problem is to symmetrize the operator
by taking half the sum of the operator and its adjoint.

The Weyl quantization, meanwhile, takes more seriously the possibility
of different orderings of $X$ and $P$, by considering all possible orderings.
Thus, in quantizing, say, $x^2p^2$, the Weyl quantization will give

\[
\widehat{x^2p^2} = \frac{1}{6}\left(X^2P^2 + XPXP + XP^2X + PX^2P + PXPX + P^2X^2\right).
\]

For a general monomial, the Weyl quantization similarly averages all the
possible orderings of the position and momentum operators.

For Wick-ordered and anti-Wick-ordered quantization, we no longer
regard the position and momentum operators as the “basic” operators,
but rather the creation and annihilation operators. Specifically, given any
positive real number $\alpha$, we introduce complex coordinates on the classical
phase space by

\[
\begin{aligned}
z &= x - i\alpha p \\
\bar{z} &= x + i\alpha p.
\end{aligned} \tag{13.1}
\]

(Although it would seem more natural to define $z$ to be $x + i\alpha p$, this
choice would lead to problems later, especially with the Segal-Bargmann
transform.) We then consider the corresponding quantum operators, which
we call the raising and lowering operators:

\[
\begin{aligned}
a^* &= X - i\alpha P \\
a &= X + i\alpha P.
\end{aligned} \tag{13.2}
\]

In comparing these operators to the ones defined in the context of the
harmonic oscillator, we should think of $\alpha$ as corresponding to $1/(m\omega)$.
Even with this identification, however, the operators in (13.2) differ by a
constant from the raising and lowering operators of Chap. 11. [The
overall normalization of the raising and lowering operators is not important
in this context, provided that we are consistent in the normalization
between (13.1) and (13.2).] In particular, the commutator of $a$ and $a^*$ is not
$I$ but rather $2\alpha\hbar I$.

In Wick-ordered quantization, we begin by expressing the classical
observable $f$ in terms of $z$ and $\bar{z}$ rather than in terms of $x$ and $p$. When we
quantize, we put all the lowering operators (coming from the factors of $\bar{z}$
in $f$) to the right, acting first, and the raising operators (coming from the
factors of $z$ in $f$) to the left, acting second. This approach to quantization is
useful in quantum field theory, where letting the lowering operators act first
can cause certain otherwise ill-defined expressions to become well defined.
In anti-Wick-ordered quantization, we do the reverse, putting the raising
operators to the right, acting first. Although anti-Wick-ordered
quantization seems singular in the context of quantum field theory, in systems with
finitely many degrees of freedom, it is actually better behaved than
Wick-ordered quantization.

Definition 13.1 Define several different quantization schemes for symbols
that are polynomials in $x$ and $p$ as follows. Each scheme is uniquely
determined, as a map from polynomials on $\mathbb{R}^2$ into operators on $C_c^\infty(\mathbb{R})$,
by the indicated formulas.

1. Pseudodifferential operator quantization:

\[
Q(x^jp^k) = X^jP^k.
\]

2. Symmetrized pseudodifferential operator quantization:

\[
Q(x^jp^k) = \frac{1}{2}\left(X^jP^k + P^kX^j\right).
\]

3. Weyl quantization:

\[
Q(x^jp^k) = \frac{1}{(j+k)!}\sum_{\sigma \in S_{j+k}} \sigma(X, X, \ldots, X, P, P, \ldots, P),
\]

where for any operators $A_1, A_2, \ldots, A_n$ and any $\sigma \in S_n$, we define

\[
\sigma(A_1, A_2, \ldots, A_n) = A_{\sigma(1)}A_{\sigma(2)}\cdots A_{\sigma(n)}. \tag{13.3}
\]

4. Wick-ordered quantization with parameter $\alpha$:

\[
Q\left((x + i\alpha p)^j(x - i\alpha p)^k\right) = (X - i\alpha P)^k(X + i\alpha P)^j, \quad \alpha > 0.
\]

5. Anti-Wick-ordered quantization with parameter $\alpha$:

\[
Q\left((x + i\alpha p)^j(x - i\alpha p)^k\right) = (X + i\alpha P)^j(X - i\alpha P)^k, \quad \alpha > 0.
\]

In applications, the most useful quantization schemes are the
Wick-ordered, anti-Wick-ordered, and Weyl schemes. All of the quantization
schemes in Definition 13.1 except the pseudodifferential operator
quantization have the property of mapping real-valued polynomials to symmetric
operators on $C_c^\infty(\mathbb{R})$. (See Exercise 3 in the case of the Wick- and
anti-Wick-ordered quantizations.)

In comparing the different quantization schemes, it is important to
recognize that two different expressions may describe the same operator. We
may calculate, for example, that

\[
\frac{1}{2}(XP^2 + P^2X) = \frac{1}{2}\left(PXP + [X, P]P + PXP - P[X, P]\right) = PXP,
\]

since $[X, P]$ is a multiple of the identity and thus commutes with $P$. As a
result, we can eliminate the $PXP$ term in the Weyl quantization of $xp^2$,
with the result that

\[
Q_{\mathrm{Weyl}}(xp^2) = \frac{1}{3}(XP^2 + PXP + P^2X) = \frac{1}{2}(XP^2 + P^2X), \tag{13.4}
\]

which coincides, in this very special case, with the symmetrized
pseudodifferential quantization of $xp^2$.
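The averaging over orderings in Definition 13.1 and the identity (13.4) can be checked mechanically. The sketch below is an illustration only: it lets $X$ and $P$ act on polynomials in $x$ (coefficient lists) with $\hbar = 1$ assumed, and the helper names (`apply_word`, `weyl_monomial`, and so on) are hypothetical.

```python
from itertools import permutations

HBAR = 1.0  # units with hbar = 1 (an assumption of this sketch)

def X(poly):
    """Multiply the polynomial (coefficient list) by x."""
    return [0j] + list(poly)

def P(poly):
    """Apply -i*hbar*d/dx to the polynomial."""
    return [-1j * HBAR * k * c for k, c in enumerate(poly)][1:] or [0j]

def combine(u, v, cu=1, cv=1):
    """Return cu*u + cv*v with zero-padding."""
    n = max(len(u), len(v))
    u = list(u) + [0j] * (n - len(u))
    v = list(v) + [0j] * (n - len(v))
    return [cu * s + cv * t for s, t in zip(u, v)]

def close(u, v, tol=1e-9):
    """True if the two coefficient lists agree up to tol."""
    return all(abs(c) < tol for c in combine(u, v, 1, -1))

def apply_word(word, poly):
    """Apply a product such as 'XPP'; the rightmost letter acts first."""
    for letter in reversed(word):
        poly = X(poly) if letter == "X" else P(poly)
    return poly

def weyl_monomial(j, k, poly):
    """Weyl quantization of x^j p^k: average over all (j+k)! orderings."""
    words = list(permutations("X" * j + "P" * k))
    total = [0j]
    for word in words:
        total = combine(total, apply_word(word, poly))
    return [c / len(words) for c in total]

psi = [1, 2, 0, 5, -3]                              # arbitrary test polynomial
weyl = weyl_monomial(1, 2, psi)                     # Q_Weyl(x p^2) psi
symmetrized = combine(apply_word("XPP", psi),
                      apply_word("PPX", psi), 0.5, 0.5)
pxp = apply_word("PXP", psi)
```

All three results (`weyl`, `symmetrized`, and `pxp`) agree on the test polynomial, in line with (13.4). Summing over all permutations with repetition gives the same average as summing over distinct orderings, which is why the sketch does not remove duplicates.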

Example 13.2 If $f(x, p) = x^2$, then the Weyl, Wick-ordered and
anti-Wick-ordered quantizations of $f$ are as follows:

\[
\begin{aligned}
Q_{\mathrm{Weyl}}(x^2) &= X^2 \\
Q_{\mathrm{Wick}}(x^2) &= X^2 - \tfrac{1}{2}\alpha\hbar I \\
Q_{\mathrm{anti\text{-}Wick}}(x^2) &= X^2 + \tfrac{1}{2}\alpha\hbar I.
\end{aligned}
\]

Proof. The value for $Q_{\mathrm{Weyl}}(x^2)$ is apparent. To compute the Wick- and
anti-Wick-ordered quantizations, we first write $x$ as $(z + \bar{z})/2$, so that

\[
x^2 = \frac{(z + \bar{z})^2}{4} = \frac{1}{4}\left(z^2 + 2z\bar{z} + \bar{z}^2\right).
\]

Thus, we have, for example,

\[
Q_{\mathrm{Wick}}(x^2) = \frac{1}{4}\left((X - i\alpha P)^2 + 2(X - i\alpha P)(X + i\alpha P) + (X + i\alpha P)^2\right).
\]

When we expand this expression out, the $P^2$ terms cancel, and the $XP$
and $PX$ terms from $(X - i\alpha P)^2$ will cancel with the $XP$ and $PX$ terms
from $(X + i\alpha P)^2$. Thus, we will be left with $X^2$ terms and the $XP$ and
$PX$ terms from the cross-term above:

\[
Q_{\mathrm{Wick}}(x^2) = \frac{1}{4}\left(4X^2 + 2i\alpha[X, P]\right).
\]

Using the commutation relation between $X$ and $P$ gives the desired result.
The calculation of $Q_{\mathrm{anti\text{-}Wick}}(x^2)$ is identical except that the order of the
factors in the cross-term is reversed, which gives the opposite sign for the
$[X, P]$ term. ■
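The two shifts in Example 13.2 can also be confirmed by letting the operators act on a test polynomial. This is a hedged sketch rather than part of the text: it assumes $\hbar = 1$, picks an arbitrary value of $\alpha$, and uses hypothetical helper names.

```python
HBAR = 1.0    # units with hbar = 1 (assumed)
ALPHA = 0.7   # arbitrary positive parameter, chosen only for illustration

def X(poly):
    """Multiply the polynomial (coefficient list) by x."""
    return [0j] + list(poly)

def P(poly):
    """Apply -i*hbar*d/dx to the polynomial."""
    return [-1j * HBAR * k * c for k, c in enumerate(poly)][1:] or [0j]

def combine(u, v, cu=1, cv=1):
    """Return cu*u + cv*v with zero-padding."""
    n = max(len(u), len(v))
    u = list(u) + [0j] * (n - len(u))
    v = list(v) + [0j] * (n - len(v))
    return [cu * s + cv * t for s, t in zip(u, v)]

def close(u, v, tol=1e-9):
    """True if the two coefficient lists agree up to tol."""
    return all(abs(c) < tol for c in combine(u, v, 1, -1))

def lowering(poly):
    """a = X + i*alpha*P, as in (13.2)."""
    return combine(X(poly), P(poly), 1, 1j * ALPHA)

def raising(poly):
    """a* = X - i*alpha*P, as in (13.2)."""
    return combine(X(poly), P(poly), 1, -1j * ALPHA)

def wick_x2(poly):
    """((a*)^2 + 2 a* a + a^2)/4: lowering operators act first in the cross-term."""
    squares = combine(raising(raising(poly)), lowering(lowering(poly)))
    cross = raising(lowering(poly))
    return [c / 4 for c in combine(squares, cross, 1, 2)]

def antiwick_x2(poly):
    """((a*)^2 + 2 a a* + a^2)/4: raising operators act first in the cross-term."""
    squares = combine(raising(raising(poly)), lowering(lowering(poly)))
    cross = lowering(raising(poly))
    return [c / 4 for c in combine(squares, cross, 1, 2)]

psi = [1, 2, 0, 5]          # test polynomial 1 + 2x + 5x^3
x2_psi = X(X(psi))          # X^2 psi
```

Here `wick_x2(psi)` agrees with $X^2\psi - \tfrac{1}{2}\alpha\hbar\psi$ and `antiwick_x2(psi)` with $X^2\psi + \tfrac{1}{2}\alpha\hbar\psi$, matching the example.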

Proposition 13.3 The Weyl quantization, viewed as a linear map of the
space of polynomials on $\mathbb{R}^2$ into operators on $C_c^\infty(\mathbb{R})$, is uniquely
characterized by the following identity:

\[
Q_{\mathrm{Weyl}}\left((ax + bp)^j\right) = (aX + bP)^j \tag{13.5}
\]

for all non-negative integers $j$ and all $a, b \in \mathbb{C}$.

Proof. The Weyl quantization is easily seen to satisfy the identity

\[
Q_{\mathrm{Weyl}}\left((a_1x + b_1p)\cdots(a_jx + b_jp)\right)
= \frac{1}{j!}\sum_{\sigma \in S_j} \sigma(a_1X + b_1P, \ldots, a_jX + b_jP) \tag{13.6}
\]

for all sequences $a_1, \ldots, a_j$ and $b_1, \ldots, b_j$ of complex numbers, where the
expression $\sigma(\cdot, \ldots, \cdot)$ is defined by (13.3). Specializing to the case where all
the $a_j$'s are equal to $a$ and all the $b_j$'s are equal to $b$ gives (13.5). Conversely,
suppose that $Q$ is any linear map of polynomials into operators on $C_c^\infty(\mathbb{R})$
satisfying $Q((ax + bp)^j) = (aX + bP)^j$ for all $a$, $b$, and $j$. For each $j$, let
$V_j$ denote the space of homogeneous polynomials $f$ of degree $j$ such that
$Q(f) = Q_{\mathrm{Weyl}}(f)$. Then $V_j$ contains all polynomials of the form $(ax + bp)^j$,
and thus, by Exercise 1, $V_j$ consists of all homogeneous polynomials of
degree $j$, so that $Q = Q_{\mathrm{Weyl}}$. ■

Proposition 13.4 The Weyl quantization satisfies

\[
\begin{aligned}
Q_{\mathrm{Weyl}}(xg) &= Q_{\mathrm{Weyl}}(x)Q_{\mathrm{Weyl}}(g) - \frac{i\hbar}{2}\,Q_{\mathrm{Weyl}}\!\left(\frac{\partial g}{\partial p}\right) && (13.7) \\
&= Q_{\mathrm{Weyl}}(g)Q_{\mathrm{Weyl}}(x) + \frac{i\hbar}{2}\,Q_{\mathrm{Weyl}}\!\left(\frac{\partial g}{\partial p}\right) && (13.8) \\
Q_{\mathrm{Weyl}}(pg) &= Q_{\mathrm{Weyl}}(p)Q_{\mathrm{Weyl}}(g) + \frac{i\hbar}{2}\,Q_{\mathrm{Weyl}}\!\left(\frac{\partial g}{\partial x}\right) && (13.9) \\
&= Q_{\mathrm{Weyl}}(g)Q_{\mathrm{Weyl}}(p) - \frac{i\hbar}{2}\,Q_{\mathrm{Weyl}}\!\left(\frac{\partial g}{\partial x}\right) && (13.10)
\end{aligned}
\]

for all polynomials $g$ in $x$ and $p$.

It should be noted that the formulas for the Weyl quantization in
Proposition 13.4 may not give the same “expression” for $Q_{\mathrm{Weyl}}(f)$ as does
Definition 13.1, but they do give the same operator. [Compare (13.4).]

Proof. Suppose $A = a_1X + b_1P$ and $B = a_2X + b_2P$. Then $[A, B]$ is a
multiple of $I$, from which we can easily verify that

\[
AB^j = B^kAB^{j-k} + k[A, B]B^{j-1},
\]

for $0 \le k \le j$. If we sum this relation over $k$ and divide by $j + 1$, we obtain

\[
AB^j = \frac{1}{j+1}\sum_{k=0}^{j} B^kAB^{j-k} + \frac{j}{2}[A, B]B^{j-1}. \tag{13.11}
\]

Now, $A$ is the Weyl quantization of $a_1x + b_1p$ and $B^j$ is the Weyl
quantization of $(a_2x + b_2p)^j$, and both terms on the right-hand side of (13.11) are
easily recognized as Weyl quantizations: by (13.6), the sum is the Weyl
quantization of $(a_1x + b_1p)(a_2x + b_2p)^j$, and $[A, B] = i\hbar(a_1b_2 - a_2b_1)I$.
Thus, after rearranging the terms and evaluating the commutator, (13.11) becomes

\[
\begin{aligned}
Q_{\mathrm{Weyl}}\left((a_1x + b_1p)(a_2x + b_2p)^j\right)
&= Q_{\mathrm{Weyl}}(a_1x + b_1p)\,Q_{\mathrm{Weyl}}\left((a_2x + b_2p)^j\right) \\
&\quad - i\hbar\,\frac{j}{2}\,(a_1b_2 - a_2b_1)\,Q_{\mathrm{Weyl}}\left((a_2x + b_2p)^{j-1}\right).
\end{aligned} \tag{13.12}
\]

Meanwhile, if we run the same argument starting with $B^jA$, we obtain a
similar result:

\[
\begin{aligned}
Q_{\mathrm{Weyl}}\left((a_1x + b_1p)(a_2x + b_2p)^j\right)
&= Q_{\mathrm{Weyl}}\left((a_2x + b_2p)^j\right)Q_{\mathrm{Weyl}}(a_1x + b_1p) \\
&\quad + i\hbar\,\frac{j}{2}\,(a_1b_2 - a_2b_1)\,Q_{\mathrm{Weyl}}\left((a_2x + b_2p)^{j-1}\right).
\end{aligned} \tag{13.13}
\]

If we specialize to the case $(a_1, b_1) = (1, 0)$ and $(a_2, b_2) = (a, b)$, we get

\[
Q_{\mathrm{Weyl}}\left(x(ax + bp)^j\right) = Q_{\mathrm{Weyl}}(x)\,Q_{\mathrm{Weyl}}\left((ax + bp)^j\right)
- i\hbar\,\frac{j}{2}\,b\,Q_{\mathrm{Weyl}}\left((ax + bp)^{j-1}\right), \tag{13.14}
\]

where the last term on the right-hand side of (13.14) is $-i\hbar/2$ times the
Weyl quantization of $\partial(ax + bp)^j/\partial p$. Thus, (13.14) is precisely (13.7) in the
case $g(x, p) = (ax + bp)^j$. We can then see from Exercise 1 that (13.7) holds
for all polynomials $g$. The proofs of (13.8), (13.9), and (13.10) are similar. ■


13.3 The Weyl Quantization for $\mathbb{R}^{2n}$

In this section, we study the Weyl quantization on a much larger class of
symbols (i.e., classical observables) than the polynomial symbols considered
in the previous section. We also generalize from symbols defined on $\mathbb{R}^2$ to
symbols defined on $\mathbb{R}^{2n}$.

13.3.1 Heuristics


It is a straightforward matter to extend the Weyl quantization on
polynomials from $\mathbb{R}^2$ to $\mathbb{R}^{2n}$. This extended quantization will satisfy

\[
Q_{\mathrm{Weyl}}\left((a \cdot x + b \cdot p)^j\right) = (a \cdot X + b \cdot P)^j \tag{13.15}
\]

for all $a, b \in \mathbb{R}^n$ and all non-negative integers $j$, as in Proposition 13.3 in
the $n = 1$ case. Suppose we wish to extend $Q_{\mathrm{Weyl}}$ to certain nonpolynomial
symbols, starting with complex exponentials. If we multiply (13.15) by
$i^j/j!$ and sum on $j$, we would expect to have

\[
Q_{\mathrm{Weyl}}\left(e^{i(a \cdot x + b \cdot p)}\right) = e^{i(a \cdot X + b \cdot P)}. \tag{13.16}
\]

Now, if $f$ is any sufficiently nice function on $\mathbb{R}^{2n}$, we can expand $f$ as an
integral involving functions of the form $\exp(i(a \cdot x + b \cdot p))$, by using the
Fourier transform:

\[
f(x, p) = (2\pi)^{-n}\int_{\mathbb{R}^{2n}} \hat{f}(a, b)\,e^{i(a \cdot x + b \cdot p)}\,da\,db,
\]

where $\hat{f}$ is the Fourier transform of $f$. In light of (13.16), it is then natural
to define

\[
Q_{\mathrm{Weyl}}(f) = (2\pi)^{-n}\int_{\mathbb{R}^{2n}} \hat{f}(a, b)\,e^{i(a \cdot X + b \cdot P)}\,da\,db. \tag{13.17}
\]


Before proceeding, let us pause for a moment to compute the operator
$\exp(i(a \cdot X + b \cdot P))$. If $A$ and $B$ are bounded operators that commute with
their commutator (i.e., such that $[A, [A, B]] = [B, [A, B]] = 0$), then

\[
e^{A+B} = e^{-[A, B]/2}\,e^{A}e^{B}. \tag{13.18}
\]

(See Theorem 14.1, which is proved in Sect. 3.1 of [21]. Equation (13.18) is
a special case of the Baker-Campbell-Hausdorff formula.) If we formally
apply (13.18) with $A = ia \cdot X$ and $B = ib \cdot P$ (even though these are
unbounded operators), we obtain

\[
e^{i(a \cdot X + b \cdot P)} = e^{i\hbar(a \cdot b)/2}\,e^{ia \cdot X}e^{ib \cdot P}. \tag{13.19}
\]
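To see where the phase in (13.19) comes from, it may help to spell out the commutator that enters the formal application of (13.18). The following is only a sketch at the same formal level as the text, assuming the canonical relations $[X_j, P_k] = i\hbar\delta_{jk}I$:

\[
[A, B] = [ia \cdot X,\ ib \cdot P] = -\sum_{j,k} a_jb_k\,[X_j, P_k] = -i\hbar\,(a \cdot b)\,I,
\qquad\text{so that}\qquad
e^{-[A, B]/2} = e^{i\hbar(a \cdot b)/2}.
\]

Since $[A, B]$ is a multiple of the identity, it commutes (formally) with both $A$ and $B$, which is the hypothesis under which (13.18) is stated.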


Meanwhile, by Example 10.16 in Sect. 10.2, we know that

\[
(e^{ib \cdot P}\psi)(x) = \psi(x + \hbar b).
\]

Thus, we may reasonably hope that

\[
\left(e^{i(a \cdot X + b \cdot P)}\psi\right)(x) = e^{i\hbar(a \cdot b)/2}\,e^{ia \cdot x}\,\psi(x + \hbar b). \tag{13.20}
\]

In general, we get incorrect results if we formally apply results for bounded
operators to operators that are unbounded. In this case, however, the result
of the formal calculation is correct. The simplest way to prove this is to
replace $a$ and $b$ by $ta$ and $tb$ on the right-hand side of (13.19) and to check
that the result is a strongly continuous one-parameter unitary group.

Proposition 13.5 For all $a$ and $b$ in $\mathbb{R}^n$, the operators $U_{a,b}(t)$ on $L^2(\mathbb{R}^n)$
given by

\[
(U_{a,b}(t)\psi)(x) = e^{it^2\hbar(a \cdot b)/2}\,e^{ita \cdot x}\,\psi(x + t\hbar b) \tag{13.21}
\]

form a strongly continuous one-parameter unitary group. The infinitesimal
generator of this group coincides with $a \cdot X + b \cdot P$ on $C_c^\infty(\mathbb{R}^n)$ and is
essentially self-adjoint on this domain. Thus, if $a \cdot X + b \cdot P$ denotes the
unique self-adjoint extension of the infinitesimal generator on $C_c^\infty(\mathbb{R}^n)$, it
follows from Stone’s theorem that

\[
e^{it(a \cdot X + b \cdot P)} = e^{it^2\hbar(a \cdot b)/2}\,e^{ita \cdot X}e^{itb \cdot P}
\]

for all $t \in \mathbb{R}$. In particular, (13.19) and (13.20) hold.

Proof. It is apparent that $U_{a,b}(t)$ is unitary for each $a$ and $b$, and it is a
simple direct computation to show that it is indeed a unitary group. Strong
continuity is proved in the usual way using a dense subspace, as in the proof
of Example 10.12. When $\psi$ is in $C_c^\infty(\mathbb{R}^n)$, it is easy to differentiate the
right-hand side of (13.21) with respect to $t$ at $t = 0$ to obtain the formula for the
infinitesimal generator. Finally, the essential self-adjointness of $a \cdot X + b \cdot P$
on $C_c^\infty(\mathbb{R}^n)$ is precisely the content of Proposition 9.40. ■
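The "simple direct computation" for the group property can be written out as follows. This is a sketch of one way to carry it out, using only the formula (13.21); the intermediate steps are supplied here and are not taken from the text.

\[
\begin{aligned}
(U_{a,b}(s)U_{a,b}(t)\psi)(x)
&= e^{is^2\hbar(a \cdot b)/2}\,e^{isa \cdot x}\,(U_{a,b}(t)\psi)(x + s\hbar b) \\
&= e^{is^2\hbar(a \cdot b)/2}\,e^{isa \cdot x}\,e^{it^2\hbar(a \cdot b)/2}\,e^{ita \cdot (x + s\hbar b)}\,\psi(x + (s + t)\hbar b) \\
&= e^{i(s^2 + 2st + t^2)\hbar(a \cdot b)/2}\,e^{i(s+t)a \cdot x}\,\psi(x + (s + t)\hbar b) \\
&= (U_{a,b}(s + t)\psi)(x).
\end{aligned}
\]

Similarly, for $\psi \in C_c^\infty(\mathbb{R}^n)$, differentiating (13.21) at $t = 0$ gives $ia \cdot x\,\psi(x) + \hbar\,b \cdot \nabla\psi(x)$, which equals $i(a \cdot X + b \cdot P)\psi$ when $P_j = -i\hbar\,\partial/\partial x_j$. The cross-term $2st$ in the exponent is exactly what the quadratic phase $t^2\hbar(a \cdot b)/2$ is there to absorb.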

With the computation of the operator $e^{i(a \cdot X + b \cdot P)}$ in hand, we return to
our analysis of the proposed formula (13.17) for the general Weyl
quantization. If the Fourier transform of $f$ is in $L^1(\mathbb{R}^{2n})$, we can regard the
right-hand side of (13.17) as an absolutely convergent “Bochner” integral
with values in the Banach space $B(\mathbf{H})$. For our purposes, however, it is
more convenient to think of operators on $L^2(\mathbb{R}^n)$ as integral operators and
to write down a formula for the integral kernel of $Q_{\mathrm{Weyl}}(f)$ in terms of $f$
itself. (But see Exercise 7.)

At a formal level, the operator mapping $\psi$ to $e^{i\hbar(a \cdot b)/2}e^{ia \cdot x}\psi(x + \hbar b)$
may be thought of as an “integral” operator, with integral kernel given by

\[
e^{i\hbar(a \cdot b)/2}\,e^{ia \cdot x}\,\delta_n(x + \hbar b - y), \tag{13.22}
\]

where $\delta_n$ is an $n$-dimensional delta-function (the $n$-dimensional analog of
the distribution in Example A.26). Thus, it should be possible to obtain the
integral kernel of $Q_{\mathrm{Weyl}}(f)$ by integrating the preceding expression against
$\hat{f}(a, b)$. To evaluate the resulting integral, we make the change of variable
$c = \hbar b$, in which case we obtain

\[
\begin{aligned}
&(2\pi)^{-n}\iint e^{i\hbar(a \cdot b)/2}\,e^{ia \cdot x}\,\delta_n(x + \hbar b - y)\,\hat{f}(a, b)\,da\,db \\
&\quad= \hbar^{-n}(2\pi)^{-n}\iint e^{i(a \cdot c)/2}\,e^{ia \cdot x}\,\delta_n(x + c - y)\,\hat{f}(a, c/\hbar)\,dc\,da \\
&\quad= \hbar^{-n}(2\pi)^{-n}\int e^{i(a \cdot (y - x))/2}\,e^{ia \cdot x}\,\hat{f}(a, (y - x)/\hbar)\,da \\
&\quad= \hbar^{-n}(2\pi)^{-n/2}\left[(2\pi)^{-n/2}\int_{\mathbb{R}^n} e^{ia \cdot (x + y)/2}\,\hat{f}(a, (y - x)/\hbar)\,da\right].
\end{aligned} \tag{13.23}
\]

We may recognize the integral in square brackets in the last line of (13.23)
as undoing the Fourier transform of $f$ in the $x$-variable, leaving us with the
partial Fourier transform of $f$ in the $p$-variable, evaluated at the point
$((x + y)/2, (y - x)/\hbar)$. (The partial Fourier transform means the ordinary Fourier
transform with respect to one of the variables, with the other variable
fixed.) Thus, we expect that $Q_{\mathrm{Weyl}}(f)$ should be the integral operator with
integral kernel $\kappa_f$ given by

\[
\kappa_f(x, y) = (2\pi\hbar)^{-n}\int_{\mathbb{R}^n} f((x + y)/2, p)\,e^{-i(y - x) \cdot p/\hbar}\,dp. \tag{13.24}
\]

13.3.2 The $L^2$ Theory

With the preceding calculations as motivation, we now define $Q_{\mathrm{Weyl}}(f)$ to
be the integral operator with kernel $\kappa_f$, beginning with the case in which
$f$ belongs to $L^2(\mathbb{R}^{2n})$. The resulting operators will turn out to be
Hilbert-Schmidt operators on $L^2(\mathbb{R}^n)$.

If $\mathbf{H}$ is a Hilbert space and $A \in B(\mathbf{H})$ is a non-negative self-adjoint
operator on $\mathbf{H}$, then it can be shown that $A$ has a well-defined (but possibly
infinite) trace. What this means is that the value of

\[
\mathrm{trace}(A) := \sum_j \langle e_j, Ae_j\rangle
\]

is the same for each orthonormal basis $\{e_j\}$ of $\mathbf{H}$. Note that since $A$ is a
non-negative operator, $\langle e_j, Ae_j\rangle$ is a non-negative real number, so that the
sum is always defined, but may have the value $+\infty$.

Now, if $A$ is any bounded operator, then $A^*A$ is self-adjoint and
non-negative. We say that $A$ is Hilbert-Schmidt if

\[
\mathrm{trace}(A^*A) < \infty.
\]

Given two Hilbert-Schmidt operators $A$ and $B$, it can be shown that $A^*B$
is a trace-class operator, meaning that the sum

\[
\mathrm{trace}(A^*B) := \sum_{j=1}^{\infty} \langle e_j, A^*Be_j\rangle
\]

is absolutely convergent and the value of the sum is independent of the
choice of orthonormal basis. We define the Hilbert-Schmidt inner product
of $A$ and $B$ and the associated Hilbert-Schmidt norm of $A$ by

\[
\begin{aligned}
\langle A, B\rangle_{HS} &:= \mathrm{trace}(A^*B) \\
\|A\|_{HS} &:= \sqrt{\mathrm{trace}(A^*A)}.
\end{aligned}
\]

It can be shown that the space of Hilbert-Schmidt operators on $\mathbf{H}$ forms a
Hilbert space with respect to the Hilbert-Schmidt inner product.
(See Sect. 19.2 for more details.) We denote the space of Hilbert-Schmidt
operators on $\mathbf{H}$ by $HS(\mathbf{H})$.

We will make use of the following standard (and elementary) result
characterizing Hilbert-Schmidt operators on $L^2(\mathbb{R}^n)$ in terms of integral
operators. (See, for example, Theorem VI.23 in Volume I of [34].)

Proposition 13.6 If $\kappa$ is in $L^2(\mathbb{R}^n \times \mathbb{R}^n)$ then for every $\psi \in L^2(\mathbb{R}^n)$, the
integral

\[
A_\kappa(\psi)(x) := \int_{\mathbb{R}^n} \kappa(x, y)\,\psi(y)\,dy \tag{13.25}
\]

is absolutely convergent for almost every $x \in \mathbb{R}^n$, and $A_\kappa(\psi)$ also belongs
to $L^2(\mathbb{R}^n)$. Furthermore, the operator $A_\kappa$ is a Hilbert-Schmidt operator on
$L^2(\mathbb{R}^n)$ and

\[
\|A_\kappa\|_{HS} = \|\kappa\|_{L^2(\mathbb{R}^n \times \mathbb{R}^n)}.
\]

Conversely, for any Hilbert-Schmidt operator $A$ on $L^2(\mathbb{R}^n)$, there exists
a unique $\kappa \in L^2(\mathbb{R}^n \times \mathbb{R}^n)$ such that $A = A_\kappa$.
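A finite-dimensional analog may make these statements more concrete: for an operator on $\mathbb{C}^n$, the matrix of entries plays the role of the kernel $\kappa$, and the Hilbert-Schmidt norm is the square root of the sum of the squared absolute values of the entries. The sketch below is only an illustration of that analogy, with hypothetical helper names and arbitrarily chosen matrices; it is not a statement about $L^2(\mathbb{R}^n)$ itself.

```python
import math

def adjoint(A):
    """Conjugate transpose of a square matrix given as a list of rows."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Product of two square matrices of the same size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    """Sum of the diagonal entries."""
    return sum(A[i][i] for i in range(len(A)))

def hs_inner(A, B):
    """Hilbert-Schmidt inner product trace(A* B)."""
    return trace(matmul(adjoint(A), B))

def hs_norm(A):
    """Hilbert-Schmidt norm sqrt(trace(A* A))."""
    return math.sqrt(hs_inner(A, A).real)

def entry_norm(A):
    """l^2 norm of the matrix entries (the discrete 'kernel')."""
    return math.sqrt(sum(abs(a) ** 2 for row in A for a in row))

A = [[1 + 2j, 0.5], [-1j, 3]]
B = [[0, 1], [2 - 1j, 1j]]

# Change of orthonormal basis by a rotation U: trace(A* A) should not change.
theta = 0.3
U = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]
rotated = matmul(adjoint(U), matmul(matmul(adjoint(A), A), U))
```

For these matrices, `hs_norm(A)` agrees with `entry_norm(A)` (the discrete counterpart of $\|A_\kappa\|_{HS} = \|\kappa\|_{L^2}$), `trace(rotated)` agrees with `hs_inner(A, A)`, and `hs_norm(matmul(A, B))` does not exceed `hs_norm(A) * hs_norm(B)`.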

We are now ready, using the discussion in Sect. 13.3.1 as motivation, to define
the Weyl quantization of $L^2$ symbols.

Definition 13.7 For all $f \in L^2(\mathbb{R}^{2n})$, define $\kappa_f : \mathbb{R}^{2n} \to \mathbb{C}$ by

\[
\kappa_f(x, y) = (2\pi\hbar)^{-n}\int_{\mathbb{R}^n} f((x + y)/2, p)\,e^{-i(y - x) \cdot p/\hbar}\,dp, \tag{13.26}
\]

and define the Weyl quantization of $f$, as an operator on $L^2(\mathbb{R}^n)$, by

\[
Q_{\mathrm{Weyl}}(f) = A_{\kappa_f},
\]

where $A_{\kappa_f}$ is defined by (13.25).

The integral in (13.26) is not necessarily absolutely convergent, and
should be understood as computing a partial Fourier transform. Thus, we
should, strictly speaking, replace the right-hand side of (13.26) with

\[
\lim_{R \to \infty}\,(2\pi\hbar)^{-n}\int_{|p| \le R} f((x + y)/2, p)\,e^{-i(y - x) \cdot p/\hbar}\,dp, \tag{13.27}
\]

where the limit is in the norm topology of $L^2(\mathbb{R}^{2n})$. [The partial Fourier
transform maps the Schwartz space $\mathcal{S}(\mathbb{R}^{2n})$ to itself. By Fubini’s theorem
and the Plancherel formula for $\mathbb{R}^n$, the partial Fourier transform is an
$L^2$-isometry and extends to a unitary map of $L^2(\mathbb{R}^{2n})$ to itself. This unitary
map can be computed by the usual formula on functions in $L^1 \cap L^2$ and
can be computed by a limiting formula similar to (13.27) in general.]

In words, we may describe the procedure for computing $\kappa_f$ at a point
$(x^1, x^2)$ in $\mathbb{R}^{2n}$ as follows. First, compute the partial Fourier transform $\mathcal{F}_p$
of $f(x, p)$ in the $p$-variable, resulting in the function $(\mathcal{F}_pf)(x, \xi)$. Then
evaluate $\mathcal{F}_pf$ at the point $x = (x^1 + x^2)/2$, $\xi = (x^2 - x^1)/\hbar$. Finally,
multiply the result by $\hbar^{-n}(2\pi)^{-n/2}$ to get

\[
\kappa_f(x^1, x^2) = \hbar^{-n}(2\pi)^{-n/2}\,(\mathcal{F}_pf)\left((x^1 + x^2)/2,\ (x^2 - x^1)/\hbar\right). \tag{13.28}
\]

Theorem 13.8 The map $Q_{\mathrm{Weyl}}$ is a constant multiple of a unitary map
of $L^2(\mathbb{R}^{2n})$ onto $HS(L^2(\mathbb{R}^n))$. The inverse map $Q_{\mathrm{Weyl}}^{-1} : HS(L^2(\mathbb{R}^n)) \to
L^2(\mathbb{R}^{2n})$ is given by

\[
Q_{\mathrm{Weyl}}^{-1}(A)(x, p) = \hbar^n\int_{\mathbb{R}^n} \kappa(x - \hbar b/2,\ x + \hbar b/2)\,e^{ib \cdot p}\,db,
\]

where $\kappa$ is the integral kernel of $A$ as in Proposition 13.6.

Furthermore, for all $f \in L^2(\mathbb{R}^{2n})$, we have $Q_{\mathrm{Weyl}}(\bar{f}) = Q_{\mathrm{Weyl}}(f)^*$; in
particular, $Q_{\mathrm{Weyl}}(f)$ is self-adjoint if $f$ is real valued.

Properly speaking, the integral in the theorem should be understood
as an $L^2$ limit, as in (13.27). The fact that $Q_{\mathrm{Weyl}}$ is unitary (up to a
constant) tells us that for an appropriate constant $c$, the operators $ce^{i(a \cdot X + b \cdot P)}$
form an “orthonormal basis in the continuous sense” for the Hilbert space
$HS(L^2(\mathbb{R}^n))$. (Compare Sect. 6.6.)

It is possible, using the same formulas, to extend the notion of Weyl
quantization to symbols belonging to the space of tempered distributions,
that is, the space of continuous linear functionals on $\mathcal{S}(\mathbb{R}^{2n})$. We will not,
however, develop this construction here. See [11] for more information.

Proof. Proposition 13.6 gives a unitary identification of $HS(L^2(\mathbb{R}^n))$ with
$L^2(\mathbb{R}^n \times \mathbb{R}^n)$. Thus, it suffices to show that the map $f \mapsto \kappa_f$ is a multiple
of a unitary map. This result holds because the partial Fourier transform
is a unitary map of $L^2(\mathbb{R}^{2n})$ to itself and composition with an invertible
linear map is a constant multiple of a unitary map. The inverse of the map
$f \mapsto \kappa_f$ is obtained by inverting the linear map and undoing the partial
Fourier transform. Finally, it is apparent from (13.26) that

\[
\kappa_{\bar{f}}(x, y) = \overline{\kappa_f(y, x)}.
\]

This, along with Exercise 6, shows that $Q_{\mathrm{Weyl}}(\bar{f}) = Q_{\mathrm{Weyl}}(f)^*$. ■

13.3.3 The Composition Formula

If $f$ and $g$ are $L^2$ functions on $\mathbb{R}^{2n}$, then $Q_{\mathrm{Weyl}}(f)$ and $Q_{\mathrm{Weyl}}(g)$ are
Hilbert-Schmidt operators, in which case their product is again Hilbert-Schmidt.
(Indeed, the product of a Hilbert-Schmidt operator and a bounded operator
is always Hilbert-Schmidt.) Thus, since $Q_{\mathrm{Weyl}}$ is a bijection of $L^2(\mathbb{R}^{2n})$ with
$HS(L^2(\mathbb{R}^n))$, there is a unique $L^2$ function, which we denote by $f \star g$, such
that

\[
Q_{\mathrm{Weyl}}(f)\,Q_{\mathrm{Weyl}}(g) = Q_{\mathrm{Weyl}}(f \star g). \tag{13.29}
\]

(Of course, the operation $\star$, like the Weyl quantization itself, depends on $\hbar$,
but we suppress this dependence in the notation.)

Proposition 13.9 The Moyal product $f \star g$ may be characterized in terms
of the Fourier transform as

\[
\widehat{(f \star g)}(a, b) = (2\pi)^{-n}\iint e^{-i\hbar(a \cdot b' - b \cdot a')/2}\,
\hat{f}(a - a', b - b')\,\hat{g}(a', b')\,da'\,db',
\]

where both integrals are over $\mathbb{R}^n$.

Note that if we set $\hbar = 0$ in the above formula, $\widehat{f \star g}$ reduces to $(2\pi)^{-n}$
times the convolution of $\hat{f}$ and $\hat{g}$, which is nothing but the Fourier transform
of $fg$. It is thus not difficult to show (Exercise 10) that

\[
\lim_{\hbar \to 0^+} f \star g = fg.
\]

That is to say, the Moyal product $f \star g$ is a “deformation” of the ordinary
pointwise product of functions on $\mathbb{R}^{2n}$. More generally, the Moyal product
can be expanded in an asymptotic expansion in powers of $\hbar$, as explained
in Sect. 2.3 of [11]. This expansion terminates in the case that $f$ and $g$ are
both polynomials.
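As a small illustration of the deformation, consider $n = 1$ and the symbols $x$ and $p$. These are not in $L^2(\mathbb{R}^2)$, so what follows is only a formal sketch; it assumes that the composition formula (13.29) is read in terms of the polynomial Weyl quantization of Sect. 13.2, acting on $C_c^\infty(\mathbb{R})$:

\[
Q_{\mathrm{Weyl}}(x)\,Q_{\mathrm{Weyl}}(p) = XP = \tfrac{1}{2}(XP + PX) + \tfrac{1}{2}[X, P]
= Q_{\mathrm{Weyl}}\!\left(xp + \tfrac{i\hbar}{2}\right),
\]

so that, in this formal sense,

\[
x \star p = xp + \frac{i\hbar}{2}, \qquad p \star x = xp - \frac{i\hbar}{2}.
\]

Both products reduce to the pointwise product $xp$ as $\hbar \to 0$, and their difference $x \star p - p \star x = i\hbar$ reflects the commutation relation $[X, P] = i\hbar I$. In this example the expansion in powers of $\hbar$ stops after the first-order term.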

Proof. It is, of course, possible to obtain this formula using kernel
functions. It is, however, easier to work with (13.17), which can be shown
(Exercise 7) to give the same result as Definition 13.7 when $f$ is a Schwartz
function. We assume standard properties of the Bochner integral for
functions with values in a Banach space [in our case, $B(\mathbf{H})$], which are similar
to those of the Lebesgue integral. (See, for example, Sect. V.5 of [46].)

We have, then,

\[
\begin{aligned}
Q_{\mathrm{Weyl}}(f)\,Q_{\mathrm{Weyl}}(g) &= (2\pi)^{-n}\iint \hat{f}(a, b)\,e^{i(a \cdot X + b \cdot P)}\,da\,db \\
&\quad\times (2\pi)^{-n}\iint \hat{g}(a', b')\,e^{i(a' \cdot X + b' \cdot P)}\,da'\,db'.
\end{aligned} \tag{13.30}
\]

Now, it is an easy calculation to verify, using Proposition 13.5, that

\[
e^{i(a \cdot X + b \cdot P)}\,e^{i(a' \cdot X + b' \cdot P)}
= e^{-i\hbar(a \cdot b' - b \cdot a')/2}\,e^{i((a + a') \cdot X + (b + b') \cdot P)}, \tag{13.31}
\]

which is what one obtains by formally applying the special case of the
Baker-Campbell-Hausdorff formula in (13.18). Thus, we may combine the
integrals in (13.30) to obtain

\[
\begin{aligned}
Q_{\mathrm{Weyl}}(f)\,Q_{\mathrm{Weyl}}(g) &= (2\pi)^{-2n}\iiiint e^{-i\hbar(a \cdot b' - b \cdot a')/2}\,e^{i((a + a') \cdot X + (b + b') \cdot P)} \\
&\quad\times \hat{f}(a, b)\,\hat{g}(a', b')\,da\,db\,da'\,db'.
\end{aligned}
\]

By introducing new variables $c = a + a'$ and $d = b + b'$ in the $a$ and $b$
integrals and reversing the order of integration, we obtain, after simplifying
the exponent,

\[
\begin{aligned}
Q_{\mathrm{Weyl}}(f)\,Q_{\mathrm{Weyl}}(g)
&= (2\pi)^{-n}\iint \Big[(2\pi)^{-n}\iint e^{-i\hbar(c \cdot b' - d \cdot a')/2} \\
&\quad\times \hat{f}(c - a', d - b')\,\hat{g}(a', b')\,da'\,db'\Big]\,e^{i(c \cdot X + d \cdot P)}\,dc\,dd.
\end{aligned}
\]

From this and (13.17), we see that $Q_{\mathrm{Weyl}}(f)Q_{\mathrm{Weyl}}(g)$ is the Weyl
quantization of the function whose Fourier transform is the quantity in square
brackets above, which is what we wanted to show. ■


Proposition 13.10 The Moyal product $f \star g$ extends to a continuous map
of $L^2(\mathbb{R}^{2n}) \times L^2(\mathbb{R}^{2n})$ into $L^2(\mathbb{R}^{2n})$ and the composition formula (13.29)
holds for all $f$ and $g$ in $L^2(\mathbb{R}^{2n})$.

Proof. A standard inequality asserts that for any two Hilbert-Schmidt
operators $A$ and $B$, we have

\[
\|AB\|_{HS} \le \|A\|_{HS}\,\|B\|_{HS}.
\]

It follows that the product map $(A, B) \mapsto AB$ is a continuous map of
$HS(L^2(\mathbb{R}^n)) \times HS(L^2(\mathbb{R}^n))$ to $HS(L^2(\mathbb{R}^n))$. Meanwhile, the Weyl
quantization is a constant multiple of a unitary map from $L^2(\mathbb{R}^{2n})$ to $HS(L^2(\mathbb{R}^n))$.
For Schwartz functions $f$ and $g$, the Moyal product is nothing but

\[
f \star g = Q_{\mathrm{Weyl}}^{-1}\left(Q_{\mathrm{Weyl}}(f)\,Q_{\mathrm{Weyl}}(g)\right). \tag{13.32}
\]

The right-hand side of (13.32) provides the desired continuous extension of
$f \star g$. Clearly, the composition formula (13.29) holds for this extension. ■


13.3.4 Commutation Relations 

In quantum mechanics, the commutator of two operators (divided by $i\hbar$)
plays a role similar to that of the Poisson bracket in classical mechanics.
Thus, we may naturally ask: To what extent does the Weyl quantization
(or any other quantization scheme) map Poisson brackets to commutators?
The short answer is: Not always. Indeed, as we will see in Sect. 13.4, no
“reasonable” quantization scheme can give an exact correspondence between
$\{f, g\}$ on the classical side and $[A, B]/(i\hbar)$ on the quantum side.
Nevertheless, such an exact correspondence does hold for various special
classes of symbols. If we consider, for example, the class of symbols that
depend only on x and not on p, then on the classical side, all such functions
Poisson commute. The Weyl quantization maps such a function f(x) to the
operator of multiplication by f(x), and thus the quantizations of any two
such functions commute. A more interesting (in particular, noncommutative)
example is as follows.

Proposition 13.11 Suppose f is a polynomial in x and p of degree at
most 2 and g is an arbitrary polynomial in x and p. Then

\[
\frac{1}{i\hbar}\,[Q_{\mathrm{Weyl}}(f), Q_{\mathrm{Weyl}}(g)] = Q_{\mathrm{Weyl}}(\{f, g\}), \tag{13.33}
\]

where {f,g} is the Poisson bracket of f and g. 

Here, we define the Weyl quantization by the obvious n-variable extension
of Definition 13.1, and we regard all operators as operating simply
on $C_c^\infty(\mathbb{R}^n)$. See Exercise 8 for another class of symbols on which (13.33)
holds. Although the requirement that g be a polynomial can be relaxed,
we will not attempt to obtain the optimal version of the result.

Proof. For notational simplicity, we abbreviate $Q_{\mathrm{Weyl}}(f)$ to $Q(f)$ for the
duration of the proof. If f has degree zero, then both sides of the desired
equality are zero. Turning to the case in which f has degree 1, we use the
n-variable extension of Proposition 13.4, the proof of which is essentially the
same as the 1-variable result. The result is as follows:

\[
Q(x_j g) = Q(x_j)Q(g) - \frac{i\hbar}{2}\, Q\!\left(\frac{\partial g}{\partial p_j}\right)
         = Q(g)Q(x_j) + \frac{i\hbar}{2}\, Q\!\left(\frac{\partial g}{\partial p_j}\right).
\]

By subtracting these two formulas and rearranging, we get

\[
\frac{1}{i\hbar}\,[Q(x_j), Q(g)] = Q\!\left(\frac{\partial g}{\partial p_j}\right) = Q(\{x_j, g\}).
\]

A very similar argument establishes the desired result when $f = p_j$ and
thus for all homogeneous polynomials of degree 1.

Suppose now that $f_1$ and $f_2$ are homogeneous polynomials of degree
1 in x and p. Then it follows easily from Proposition 13.4 that for any
polynomial h, we have

\[
Q(f_j h) = \frac{1}{2}\big(Q(f_j)Q(h) + Q(h)Q(f_j)\big), \quad j = 1, 2. \tag{13.34}
\]

In particular, we have

\[
Q(f_1 f_2) = \frac{1}{2}\big(Q(f_1)Q(f_2) + Q(f_2)Q(f_1)\big). \tag{13.35}
\]

Using (13.35) and the product rule for commutators (Proposition 3.15), we
have

\[
\frac{1}{i\hbar}\,[Q(f_1 f_2), Q(g)]
= \frac{1}{2i\hbar}\Big([Q(f_1), Q(g)]\,Q(f_2) + Q(f_1)\,[Q(f_2), Q(g)]
+ [Q(f_2), Q(g)]\,Q(f_1) + Q(f_2)\,[Q(f_1), Q(g)]\Big).
\]

Using the degree-1 case of the result we are trying to prove, along with
(13.34), we get

\[
\begin{aligned}
\frac{1}{i\hbar}\,[Q(f_1 f_2), Q(g)]
&= \frac{1}{2}\Big(Q(\{f_1, g\})Q(f_2) + Q(f_1)Q(\{f_2, g\})
 + Q(\{f_2, g\})Q(f_1) + Q(f_2)Q(\{f_1, g\})\Big) \\
&= Q(f_2\{f_1, g\}) + Q(f_1\{f_2, g\}) \\
&= Q(\{f_1 f_2, g\}),
\end{aligned} \tag{13.36}
\]

where in the last equality we have used the product rule for the Poisson
bracket. We have now established the desired result when f is a homogeneous
polynomial of degree 0, 1, or 2. ■

At first glance, it appears that one could extend the result to the case
where f has degree 3, by considering three homogeneous polynomials $f_1$, $f_2$,
and $f_3$ of degree 1 and symmetrizing as in (13.35). The argument breaks
down, however, because the $Q(f_j)$'s do not commute. The $Q(f_j)$'s will not
always occur in the correct order to allow us to pull the $f_j$'s back inside the
Weyl quantization, the way we did in (13.36) in the degree-2 case. Indeed,
an elementary but tedious calculation shows that

\[
\frac{1}{i\hbar}\,[Q_{\mathrm{Weyl}}(x^2 p), Q_{\mathrm{Weyl}}(x p^2)] = 3X^2P^2 - 6i\hbar XP - \hbar^2 I,
\]

whereas

\[
Q_{\mathrm{Weyl}}(\{x^2 p, x p^2\}) = 3X^2P^2 - 6i\hbar XP - \frac{3}{2}\hbar^2 I,
\]

so that the two expressions differ by $\hbar^2 I/2$.
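The discrepancy of $\hbar^2 I/2$ can be checked mechanically. The sketch below is an illustration under stated assumptions, not the calculation referred to in the text: it works in one variable, sets $\hbar = 1$, stores a polynomial in x as a dictionary {power: coefficient}, and computes the Weyl quantization of $x^j p^k$ as the average over all orderings of j factors of X and k factors of P. All function names are hypothetical.

```python
from itertools import permutations

HBAR = 1.0  # illustrative choice of units (assumption)

def apply_X(poly):
    """Multiply a polynomial {power: coeff} by x."""
    return {k + 1: c for k, c in poly.items()}

def apply_P(poly):
    """Apply P = -i*hbar*d/dx to a polynomial."""
    return {k - 1: -1j * HBAR * k * c for k, c in poly.items() if k > 0}

def apply_word(word, poly):
    """Apply an operator word such as 'XPX'; the rightmost letter acts first."""
    for letter in reversed(word):
        poly = apply_X(poly) if letter == "X" else apply_P(poly)
    return poly

def combine(p, q, scale):
    """Return p + scale * q for polynomials stored as dictionaries."""
    out = dict(p)
    for k, c in q.items():
        out[k] = out.get(k, 0) + scale * c
    return out

def weyl(j, k):
    """Weyl quantization of x^j p^k: average over all distinct orderings."""
    words = sorted(set(permutations("X" * j + "P" * k)))
    def op(poly):
        total = {}
        for w in words:
            total = combine(total, apply_word(w, poly), 1.0 / len(words))
        return total
    return op

def defect(poly):
    """Apply (1/(i hbar))[Q(x^2 p), Q(x p^2)] - Q({x^2 p, x p^2}) to poly."""
    A, B, C = weyl(2, 1), weyl(1, 2), weyl(2, 2)
    comm = combine(A(B(poly)), B(A(poly)), -1.0)        # [A, B] poly
    lhs = {k: c / (1j * HBAR) for k, c in comm.items()}
    return combine(lhs, C(poly), -3.0)                  # {x^2 p, x p^2} = 3 x^2 p^2

for n in range(4):
    print(n, defect({n: 1.0}))
```

With these conventions the defect operator should send $x^n$ to approximately $0.5\,x^n$ (that is, $\hbar^2/2$ times the input), up to floating-point rounding.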

We conclude this section with a brief glimpse of an important “equivariance”
property of the Weyl quantization. Note that the Poisson bracket of
two real-valued homogeneous polynomials of degree 2 is again real valued
and homogeneous of degree 2. The space of real homogeneous polynomials
of degree 2 thus forms a Lie algebra (Sect. 16.3) with respect to the Poisson
bracket. This Lie algebra is naturally isomorphic to the Lie algebra $\mathrm{sp}(n; \mathbb{R})$
of the Lie group $\mathrm{Sp}(n; \mathbb{R})$, the real symplectic group. This group is the group of
invertible linear transformations that preserve a skew-symmetric form on
$\mathbb{R}^{2n}$. See Chap. 16 for information about Lie groups and their Lie algebras.

If we apply Proposition 13.11 in the case in which both f and g are
homogeneous of degree 2, we see that the map $\pi(f) := Q_{\mathrm{Weyl}}(f)/(i\hbar)$ is a
representation of $\mathrm{sp}(n; \mathbb{R})$ in the space of skew-symmetric operators on $L^2(\mathbb{R}^n)$.
It can be shown that associated to this representation of $\mathrm{sp}(n; \mathbb{R})$ there is
a projective unitary representation $\Pi$ of the group $\mathrm{Sp}(n; \mathbb{R})$, known as the
metaplectic representation. (See, again, Chap. 16 for definitions.) Proposition
13.11 is the infinitesimal version of the following equivariance property
of the Weyl quantization: For all $A \in \mathrm{Sp}(n; \mathbb{R})$ and all $f \in L^2(\mathbb{R}^{2n})$, we
have

\[
Q_{\mathrm{Weyl}}(f \circ A^{-1}) = \Pi(A)\, Q_{\mathrm{Weyl}}(f)\, \Pi(A)^{-1}.
\]

See Theorem 2.15 and Chap. 4 of [11] [where our $\Pi(A)$ corresponds to
$\mu((A^*)^{-1})$ in Folland’s notation] for this result and much more about the
metaplectic representation.


13.4 The “No Go” Theorem of Groenewold 

In Sect. 13.3.4, we noted that the Weyl quantization on polynomials satisfies

\[
\frac{1}{i\hbar}\,[Q_{\mathrm{Weyl}}(f), Q_{\mathrm{Weyl}}(g)] = Q_{\mathrm{Weyl}}(\{f, g\}), \tag{13.37}
\]

provided that f is a polynomial of degree at most 2, but not in general. One might
think that the failure of (13.37) represents a shortcoming in the definition
of the Weyl quantization, which could be remedied by an alternative definition.
In this section, however, we will see that no quantization scheme that
maps $x_j$ and $p_j$ to the usual position and momentum operators $X_j$ and $P_j$
can satisfy (13.37) for general polynomials in x and p. This sort of nonexistence
result, of a construct satisfying seemingly natural and desirable
conditions, is referred to in the physics literature as a “no go” theorem.

In light of this result, one might think that perhaps the position and 
momentum operators should be defined differently, possibly with an ac¬ 
companying change in the choice of the quantum Hilbert space. Indeed, 
there is a map Q that satisfies (13.37) for all f and g, namely the
prequantization map described in Sect. 23.3. The prequantization map
accomplishes this feat by drastically enlarging the quantum Hilbert space, from
$L^2(\mathbb{R}^n)$ to $L^2(\mathbb{R}^{2n})$. The Hilbert space $L^2(\mathbb{R}^{2n})$ is considered to be “too
big” from a physical standpoint, which explains why the map Q is only
“prequantization” rather than “quantization.” (The prequantization map 
has a number of other undesirable features that are described in Sect. 23.3.) 
If one imposes a natural “smallness” assumption on the quantum Hilbert 
space (irreducibility under the action of the position and momentum op¬ 
erators), then the Stone-von Neumann theorem will tell us that (modulo 
certain technical domain assumptions) any choice of position and momen¬ 
tum operators satisfying the canonical commutation relations is unitarily 
equivalent to the usual ones. 

The upshot of the discussion in the two preceding paragraphs is that 
there is no physically reasonable quantization scheme that satisfies (13.37) 
for all (polynomial) functions f and g.

We turn, now, to Groenewold’s “no go” theorem. We need to make 
domain assumptions, so that it makes sense to compute the commuta¬ 
tors of the quantized operators. The simplest approach is to assume that 
the quantization Q(f) of any polynomial f will be in the algebra generated
by the X’s and P’s, and thus that Q(f) will be a differential operator
with polynomial coefficients. There is a variant of this result, known as van
Hove’s theorem, that proves a similar “no go” result under a more general
assumption about the form of the quantized operators. See [15] for a
rigorous proof of van Hove’s theorem.

Definition 13.12 For any $k \ge 0$, let $\mathcal{P}_k$ denote the space of homogeneous
polynomials of degree k and let $\mathcal{P}_{\le k}$ denote the space of all polynomials of
degree at most k.

Theorem 13.13 (Groenewold’s Theorem) Let $\mathcal{D}(\mathbb{R}^n)$ denote the space
of differential operators on $\mathbb{R}^n$ with polynomial coefficients. There does not
exist a linear map $Q : \mathcal{P}_{\le 4} \to \mathcal{D}(\mathbb{R}^n)$ with the following properties.

1. $Q(1) = I$.

2. $Q(x_j) = X_j$ and $Q(p_j) = P_j$.

3. For all f and g in $\mathcal{P}_{\le 3}$, we have

\[
Q(\{f, g\}) = \frac{1}{i\hbar}\,[Q(f), Q(g)]. \tag{13.38}
\]

Note that in Property 3 of the theorem, we assume that f and g belong
to $\mathcal{P}_{\le 3}$ rather than $\mathcal{P}_{\le 4}$. This assumption guarantees that $\{f, g\}$ belongs
to $\mathcal{P}_{\le 4}$, so that the left-hand side of (13.38) is defined.

Our strategy in proving Groenewold’s theorem is the following. We know
(Proposition 13.11) that the Weyl quantization satisfies (13.38) if f has
degree at most 2 and g has degree at most 3. Using this result, we can
show that any map Q satisfying the properties in Theorem 13.13 must
coincide with the Weyl quantization on $\mathcal{P}_{\le 3}$. We then identify a polynomial
$f \in \mathcal{P}_4$ that can be expressed as a Poisson bracket in two different ways,
$f = \{g, h\} = \{g', h'\}$, with g, h, g', and h' in $\mathcal{P}_3$. Upon calculating that
$[Q_{\mathrm{Weyl}}(g), Q_{\mathrm{Weyl}}(h)]$ does not coincide with $[Q_{\mathrm{Weyl}}(g'), Q_{\mathrm{Weyl}}(h')]$, we will
have a contradiction.

The proof will consist of several lemmas, followed by the coup de grace.

Lemma 13.14 Consider an element A of $\mathcal{D}(\mathbb{R}^n)$ expressed as

\[
A = \sum_{\mathbf{k}} f_{\mathbf{k}}(x) \left(\frac{\partial}{\partial x}\right)^{\mathbf{k}},
\]

where k ranges over multi-indices, where the $f_{\mathbf{k}}$’s are polynomials, and
where only finitely many of the $f_{\mathbf{k}}$’s are nonzero. Then A is the zero operator
on $C_c^\infty(\mathbb{R}^n)$ only if each of the $f_{\mathbf{k}}$’s is zero.

Proof. For each multi-index k, let $|\mathbf{k}| = k_1 + \cdots + k_n$. Suppose not all
the $f_{\mathbf{k}}$’s are zero, let N be the smallest non-negative integer for which $f_{\mathbf{k}}$
is nonzero for some k with $|\mathbf{k}| = N$, and let $\mathbf{k}_0$ be some multi-index with
$|\mathbf{k}_0| = N$ and $f_{\mathbf{k}_0} \ne 0$. Let us apply A to a function g that is equal, in a
neighborhood of the origin, to $x^{\mathbf{k}_0}$. Then all the terms in Ag other than
the $f_{\mathbf{k}_0}$ term will be zero in a neighborhood of the origin, whereas the $f_{\mathbf{k}_0}$
term will equal a nonzero constant multiple of $f_{\mathbf{k}_0}(x)$ there, which is not
identically zero in any neighborhood of the origin. Thus, A is not the zero
operator. ■

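Lemma 13.15 If A belongs to $\mathcal{D}(\mathbb{R}^n)$ and A commutes with $X_j$ and $P_j$
for all $j = 1, \ldots, n$, then $A = cI$ for some $c \in \mathbb{C}$.

Proof. We may easily prove by induction that

\[
\left(\frac{\partial}{\partial x_j}\right)^{k} \big(x_j g(x)\big)
= k \left(\frac{\partial}{\partial x_j}\right)^{k-1} g(x) + x_j \left(\frac{\partial}{\partial x_j}\right)^{k} g(x)
\]

for any polynomial g. Thus, for any multi-index k, we have

\[
\left[\left(\frac{\partial}{\partial x}\right)^{\mathbf{k}}, X_j\right] = k_j \left(\frac{\partial}{\partial x}\right)^{\mathbf{k}-\mathbf{e}_j}. \tag{13.39}
\]

Suppose A is a nonzero element of $\mathcal{D}(\mathbb{R}^n)$ that commutes with each $X_j$.
If $\deg(A) = M$, consider a nonzero term in A of degree M:

\[
f_{\mathbf{k}_0}(x) \left(\frac{\partial}{\partial x}\right)^{\mathbf{k}_0}, \quad |\mathbf{k}_0| = M, \quad f_{\mathbf{k}_0} \ne 0.
\]

If M > 0, we can pick some j such that the jth entry of $\mathbf{k}_0$ is nonzero.
By (13.39) and our assumption on A, we have

\[
0 = [A, X_j] = (\mathbf{k}_0)_j\, f_{\mathbf{k}_0}(x) \left(\frac{\partial}{\partial x}\right)^{\mathbf{k}_0-\mathbf{e}_j} + \text{other terms},
\]

where the other terms involve multi-indices of the form $\mathbf{k} - \mathbf{e}_j$, with $\mathbf{k} \ne \mathbf{k}_0$.
Thus, by Lemma 13.14, $[A, X_j]$ is not the zero operator, which is a contradiction.

We see, then, that any $A \in \mathcal{D}(\mathbb{R}^n)$ that commutes with each $X_j$ must be
of degree zero; that is, A must simply be multiplication by some polynomial
f(x). If, in addition, A commutes with each $P_j$, then

\[
0 = [f(x), P_j] = i\hbar\, \frac{\partial f}{\partial x_j}.
\]

Thus, actually, f must be constant and A is a multiple of the identity
operator. ■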
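The induction identity behind (13.39) is easy to test in one variable. The following sketch represents a polynomial as a list of coefficients (lowest degree first) and compares both sides of $\frac{d^k}{dx^k}(x\,g) = k\,g^{(k-1)} + x\,g^{(k)}$ for a sample polynomial; the helper names and the sample g are illustrative assumptions.

```python
def derivative(poly, order=1):
    """Differentiate a coefficient list (lowest degree first) `order` times."""
    for _ in range(order):
        poly = [k * c for k, c in enumerate(poly)][1:]
    return poly

def times_x(poly):
    """Multiply a polynomial by x."""
    return [0] + list(poly)

def add(p, q):
    """Add two coefficient lists of possibly different lengths."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def strip(poly):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    poly = list(poly)
    while poly and poly[-1] == 0:
        poly.pop()
    return poly

g = [5, -1, 0, 2, 7]  # g(x) = 5 - x + 2x^3 + 7x^4 (arbitrary sample)

checks = []
for k in range(1, 6):
    lhs = derivative(times_x(g), k)
    rhs = add([k * c for c in derivative(g, k - 1)], times_x(derivative(g, k)))
    checks.append(strip(lhs) == strip(rhs))

print(checks)
```

Rearranged, the identity says that $[d^k/dx^k, X] = k\, d^{k-1}/dx^{k-1}$ on polynomials, which is the one-variable case of (13.39).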


Lemma 13.16 For any $f \in \mathcal{P}_2$, there exist $g_1, \ldots, g_j$ and $h_1, \ldots, h_j$ in $\mathcal{P}_2$
such that

\[
f = \{g_1, h_1\} + \cdots + \{g_j, h_j\}.
\]

Furthermore, for any $f \in \mathcal{P}_3$, there exist elements $g'_1, \ldots, g'_k$ of $\mathcal{P}_3$ and
$h'_1, \ldots, h'_k$ of $\mathcal{P}_2$ such that

\[
f = \{g'_1, h'_1\} + \cdots + \{g'_k, h'_k\}.
\]

Proof. See Exercise 12. ■

Lemma 13.17 If Q satisfies the conditions in Theorem 13.13, then Q
coincides with $Q_{\mathrm{Weyl}}$ on $\mathcal{P}_{\le 3}$.

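Proof. Our argument leans heavily on Proposition 13.11. Note that, by
assumption, Q coincides with $Q_{\mathrm{Weyl}}$ on $\mathcal{P}_{\le 1}$. For $f \in \mathcal{P}_2$, let us write
Q(f) as

\[
Q(f) = Q_{\mathrm{Weyl}}(f) + A_f.
\]

For any $g \in \mathcal{P}_{\le 1}$, we have, by (13.38) and Proposition 13.11,

\[
\begin{aligned}
Q(\{f, g\}) &= \frac{1}{i\hbar}\,[Q(f), Q(g)] \\
&= \frac{1}{i\hbar}\,[Q_{\mathrm{Weyl}}(f), Q_{\mathrm{Weyl}}(g)] + \frac{1}{i\hbar}\,[A_f, Q_{\mathrm{Weyl}}(g)] \\
&= Q_{\mathrm{Weyl}}(\{f, g\}) + \frac{1}{i\hbar}\,[A_f, Q_{\mathrm{Weyl}}(g)] \\
&= Q(\{f, g\}) + \frac{1}{i\hbar}\,[A_f, Q_{\mathrm{Weyl}}(g)],
\end{aligned} \tag{13.40}
\]

since $\{f, g\} \in \mathcal{P}_{\le 1}$. Thus, $[A_f, Q_{\mathrm{Weyl}}(g)] = 0$ for every $g \in \mathcal{P}_1$, and so, by
Lemma 13.15, we must have $A_f = c_f I$ for some constant $c_f$.

Now, if h is in $\mathcal{P}_2$, we have, by the just-established result and Proposition
13.11,

\[
\begin{aligned}
Q(\{f, h\}) &= \frac{1}{i\hbar}\,[Q(f), Q(h)] \\
&= \frac{1}{i\hbar}\,[Q_{\mathrm{Weyl}}(f) + c_f I,\ Q_{\mathrm{Weyl}}(h) + c_h I] \\
&= \frac{1}{i\hbar}\,[Q_{\mathrm{Weyl}}(f), Q_{\mathrm{Weyl}}(h)] \\
&= Q_{\mathrm{Weyl}}(\{f, h\}).
\end{aligned} \tag{13.41}
\]

That is to say, Q and $Q_{\mathrm{Weyl}}$ agree on elements of $\mathcal{P}_2$ of the form $\{f, h\}$, for
$f, h \in \mathcal{P}_2$. Thus, by Lemma 13.16, Q and $Q_{\mathrm{Weyl}}$ agree on all of $\mathcal{P}_2$, and so
on all of $\mathcal{P}_{\le 2}$.

We now use the $\mathcal{P}_{\le 2}$ case of the lemma to establish the $\mathcal{P}_3$ case. Given $f \in \mathcal{P}_3$,
we write $Q(f) = Q_{\mathrm{Weyl}}(f) + B_f$. Given $g \in \mathcal{P}_{\le 1}$, we have $\{f, g\} \in \mathcal{P}_{\le 2}$.
Thus, we may argue as in (13.40), applying the just-established $\mathcal{P}_{\le 2}$ case of
the lemma to $\{f, g\}$ in the last step. The conclusion is that $[B_f, Q(g)] = 0$
for all $g \in \mathcal{P}_{\le 1}$ and thus, by Lemma 13.15, that $B_f = d_f I$ for some constant
$d_f$. Meanwhile, if $h \in \mathcal{P}_2$, we argue as in (13.41), but with $c_f$ replaced by
$d_f$ and with $c_h$ now known to be zero. The conclusion is that Q agrees with
$Q_{\mathrm{Weyl}}$ for all elements of $\mathcal{P}_3$ of the form $\{f, h\}$ with $f \in \mathcal{P}_3$ and $h \in \mathcal{P}_2$,
and thus, by Lemma 13.16, for all elements of $\mathcal{P}_3$. ■

Proof of Theorem 13.13. Assume, toward a contradiction, that a map Q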
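as in the theorem exists. Let f be the polynomial given by

\[
f(x, p) = x_1^2 p_1^2.
\]

We observe that f can be written in two different ways as a Poisson bracket:

\[
x_1^2 p_1^2 = \frac{1}{9}\{x_1^3, p_1^3\} = \frac{1}{3}\{x_1^2 p_1, x_1 p_1^2\}.
\]

Thus, by Lemma 13.17, we must have

\[
\frac{1}{9}\frac{1}{i\hbar}\,[Q_{\mathrm{Weyl}}(x_1^3), Q_{\mathrm{Weyl}}(p_1^3)] = Q(x_1^2 p_1^2)
= \frac{1}{3}\frac{1}{i\hbar}\,[Q_{\mathrm{Weyl}}(x_1^2 p_1), Q_{\mathrm{Weyl}}(x_1 p_1^2)].
\]

On the other hand, if we apply both commutators to the constant function
1 (or to a function equal to 1 in a neighborhood of the origin), we
obtain

\[
\frac{1}{9}\frac{1}{i\hbar}\,[Q_{\mathrm{Weyl}}(x_1^3), Q_{\mathrm{Weyl}}(p_1^3)]\,1
= \frac{1}{9}\frac{1}{i\hbar}\,(X_1^3 P_1^3 - P_1^3 X_1^3)\,1
= -\frac{6}{9}\frac{1}{i\hbar}(-i\hbar)^3.
\]

Meanwhile, if we compute the quantizations as in (13.4) and then drop all
terms involving $P_1 1$, we obtain (after a small computation)

\[
\begin{aligned}
\frac{1}{3}\frac{1}{i\hbar}\,[Q_{\mathrm{Weyl}}(x_1^2 p_1), Q_{\mathrm{Weyl}}(x_1 p_1^2)]\,1
&= \frac{1}{12}\frac{1}{i\hbar}\,(X_1^2 P_1^3 X_1 + P_1 X_1^2 P_1^2 X_1)\,1 \\
&\quad - \frac{1}{12}\frac{1}{i\hbar}\,(X_1 P_1^3 X_1^2 + P_1^2 X_1 P_1 X_1^2)\,1 \\
&= -\frac{1}{12}\frac{1}{i\hbar}\,P_1^2 X_1 P_1 X_1^2\,1 \\
&= -\frac{4}{12}\frac{1}{i\hbar}(-i\hbar)^3.
\end{aligned}
\]

Since 6/9 does not equal 4/12, we have a contradiction. ■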
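The two numbers in this contradiction can be reproduced numerically. The sketch below is an illustration under stated assumptions: it works in one variable with $\hbar = 1$, takes the Weyl quantization of a monomial to be the average over all operator orderings, and represents the monomial $x^a p^b$ by the pair (a, b). The function names are hypothetical.

```python
from itertools import permutations

HBAR = 1.0  # illustrative choice of units (assumption)

def apply_X(poly):
    """Multiply a polynomial {power: coeff} by x."""
    return {k + 1: c for k, c in poly.items()}

def apply_P(poly):
    """Apply P = -i*hbar*d/dx."""
    return {k - 1: -1j * HBAR * k * c for k, c in poly.items() if k > 0}

def apply_word(word, poly):
    """Apply an operator word; the rightmost letter acts first."""
    for letter in reversed(word):
        poly = apply_X(poly) if letter == "X" else apply_P(poly)
    return poly

def combine(p, q, scale):
    """Return p + scale * q for polynomials stored as dictionaries."""
    out = dict(p)
    for k, c in q.items():
        out[k] = out.get(k, 0) + scale * c
    return out

def weyl(j, k):
    """Weyl quantization of x^j p^k as the average over all orderings."""
    words = sorted(set(permutations("X" * j + "P" * k)))
    def op(poly):
        total = {}
        for w in words:
            total = combine(total, apply_word(w, poly), 1.0 / len(words))
        return total
    return op

def poisson(m1, m2):
    """{x^a p^b, x^c p^d} = (ad - bc) x^(a+c-1) p^(b+d-1), as (coeff, monomial)."""
    (a, b), (c, d) = m1, m2
    return a * d - b * c, (a + c - 1, b + d - 1)

def bracket_on_one(m1, m2):
    """Value of (1/(i hbar)) [Q(m1), Q(m2)] applied to the constant function 1."""
    A, B = weyl(*m1), weyl(*m2)
    one = {0: 1.0}
    comm = combine(A(B(one)), B(A(one)), -1.0)
    return comm.get(0, 0) / (1j * HBAR)

print(poisson((3, 0), (0, 3)), poisson((2, 1), (1, 2)))  # both proportional to x^2 p^2
first = bracket_on_one((3, 0), (0, 3)) / 9
second = bracket_on_one((2, 1), (1, 2)) / 3
print(first, second)
```

Since $(-i\hbar)^3/(i\hbar) = \hbar^2$, the two values should come out near $-2/3$ and $-1/3$ respectively when $\hbar = 1$, so the two ways of writing $x_1^2 p_1^2$ as a Poisson bracket give different operators.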


13.5 Exercises 

1. Let $V_j$ denote the space of complex-valued homogeneous polynomials
on $\mathbb{R}^2$ of degree j. Then $V_j$ is a complex vector space of dimension
j + 1, which we may identify with $\mathbb{C}^{j+1}$ using the obvious basis for $V_j$.
Let $W_j$ denote the complex subspace of $V_j$ spanned by polynomials
of the form $(ax + bp)^j$, with $a, b \in \mathbb{C}$. Show that $W_j = V_j$.

Hint: Since every subspace of $\mathbb{C}^{j+1}$ is (topologically) closed, if $\gamma(t)$ is
a smooth curve in $W_j$, the derivative $\gamma'(t)$ will also lie in $W_j$.

2. Show that the symmetrized pseudodifferential operator quantization of
$x^2 p^2$ is equal to $Q_{\mathrm{Weyl}}(x^2 p^2) - \hbar^2/2$.

3. Show that Wick-ordered and anti-Wick-ordered quantizations map
real-valued polynomials to symmetric operators on $C_c^\infty(\mathbb{R})$.

Hint: Compare the values of each quantization scheme on $z^k \bar z^l$ and
on its complex conjugate $\overline{z^k \bar z^l}$.

4. Consider a classical harmonic oscillator with Hamiltonian

\[
H(x, p) = \frac{p^2}{2m} + \frac{1}{2} m\omega^2 x^2 = \frac{1}{2} m\omega^2 \left( x^2 + \left(\frac{p}{m\omega}\right)^2 \right),
\]

where $\omega$ is the frequency of the oscillator. Consider the Wick- and
anti-Wick-ordered quantizations with parameter $\alpha = 1/(m\omega)$. Show
that

\[
Q_{\mathrm{Wick}}(H) = Q_{\mathrm{Weyl}}(H) - \frac{1}{2}\hbar\omega I,
\qquad
Q_{\mathrm{anti\text{-}Wick}}(H) = Q_{\mathrm{Weyl}}(H) + \frac{1}{2}\hbar\omega I.
\]

5. Let $U_{\mathbf{a},\mathbf{b}}(t)$ be as in Proposition 13.5. Show by direct calculation that
these operators form a one-parameter unitary group.

6. Given $\kappa \in L^2(\mathbb{R}^n \times \mathbb{R}^n)$, let $A_\kappa$ denote the associated integral operator
on $L^2(\mathbb{R}^n)$, as in Proposition 13.6. Show that the adjoint $A_\kappa^*$ of $A_\kappa$ is
also an integral operator, with integral kernel $\kappa'$ given by

\[
\kappa'(x, y) = \overline{\kappa(y, x)}.
\]

7. Suppose that $f \in L^2(\mathbb{R}^{2n})$ and that $\hat f \in L^1(\mathbb{R}^{2n})$. Then the right-hand
side of (13.17) may be understood as an absolutely convergent
“Bochner” integral with values in the Banach space $\mathcal{B}(L^2(\mathbb{R}^n))$. Show
that $Q_{\mathrm{Weyl}}(f)$ as defined by (13.17) coincides with $Q_{\mathrm{Weyl}}(f)$ as defined
in Definition 13.7.

Hint: The Bochner integral commutes with applying a bounded linear
functional. Use this result with the linear functional $A \mapsto \langle \phi, A\psi \rangle$
on $\mathcal{B}(L^2(\mathbb{R}^n))$. Then use the expression in (13.23) for $\kappa_f$,
which follows from Definition 13.7 by applying a partial Fourier transform.

8. (a) Show that for any polynomial f in one variable, we have

\[
Q_{\mathrm{Weyl}}(f(x)p) = f(X)P - \frac{i\hbar}{2} f'(X).
\]

(b) Show that for any two polynomials f and g, the Poisson bracket
$\{f(x)p, g(x)p\}$ is of the form $h(x)p$ for some polynomial h.

(c) Show that for any two polynomials f and g, we have

\[
\frac{1}{i\hbar}\,[Q_{\mathrm{Weyl}}(f(x)p), Q_{\mathrm{Weyl}}(g(x)p)] = Q_{\mathrm{Weyl}}(\{f(x)p, g(x)p\}).
\]

9. (a) Given $\phi$ and $\psi$ in $L^2(\mathbb{R}^n)$, let $|\phi\rangle\langle\psi|$ be the operator defined in
Notation 3.28. Show that $|\phi\rangle\langle\psi|$ can be expressed as an integral
operator as in Proposition 13.6 and determine the associated
integral kernel $\kappa$.

(b) For $\sigma > 0$, let $\psi_\sigma \in L^2(\mathbb{R}^n)$ be given by the expression

\[
\psi_\sigma(x) = (\pi\sigma)^{-n/4} e^{-|x|^2/(2\sigma)}.
\]

Using Proposition A.22, show that $\psi_\sigma$ is a unit vector in $L^2(\mathbb{R}^n)$
and that the Weyl symbol of the corresponding one-dimensional
projection operator $|\psi_\sigma\rangle\langle\psi_\sigma|$ is given by

\[
Q_{\mathrm{Weyl}}^{-1}(|\psi_\sigma\rangle\langle\psi_\sigma|) = 2^n e^{-|x|^2/\sigma} e^{-\sigma|p|^2/\hbar^2}.
\]

Note: If we give $\sigma$ the value $\hbar/(m\omega)$, the Gaussian function $\psi_\sigma$ may
be thought of as the ground state for an n-dimensional harmonic oscillator.
(Compare the functions in Theorem 11.3.) The computation
in this exercise plays an important role in the proof of the Stone-von
Neumann theorem in Chap. 14.

10. If f and g are Schwartz functions on $\mathbb{R}^{2n}$, show that $\widehat{f \star g}$ converges
in the $L^1$ norm to $(2\pi)^{-n} \hat f * \hat g$, where $*$ denotes convolution. Conclude
that $f \star g$ converges uniformly to $fg$ as $\hbar$ tends to zero.

11. Suppose that f(p, q) is a homogeneous polynomial of degree 2. Show
that for each t, the Hamiltonian flow $\Phi_t$ associated with f is a linear
map of $\mathbb{R}^{2n}$ to itself.

12. Prove Lemma 13.16. 

Hint: Let $g_1 \in \mathcal{P}_2$ be given by

\[
g_1(x, p) = \sum_{j=1}^{n} x_j p_j.
\]

Show that for any monomial of the form $x^{\mathbf{j}} p^{\mathbf{k}}$, we have $\{g_1, x^{\mathbf{j}} p^{\mathbf{k}}\} =
(|\mathbf{k}| - |\mathbf{j}|)\, x^{\mathbf{j}} p^{\mathbf{k}}$. Thus, most of the standard basis elements f for $\mathcal{P}_2$
and all of the standard basis elements f for $\mathcal{P}_3$ can be obtained as
nonzero multiples of $\{g_1, f\}$.



The Stone-von Neumann Theorem


The Stone-von Neumann theorem is a uniqueness theorem for operators
satisfying the canonical commutation relations. Suppose A and B are two
self-adjoint operators on H satisfying $[A, B] = i\hbar I$. Suppose also that A
and B act irreducibly on H, meaning that the only closed subspaces of
H invariant under A and B are {0} and H. Then provided that certain
technical assumptions hold (the exponentiated commutation relations), we
will conclude that A and B are unitarily equivalent to the usual position
and momentum operators X and P. That is, there is a unitary operator
$U : \mathbf{H} \to L^2(\mathbb{R})$ such that $UAU^{-1} = X$ and $UBU^{-1} = P$. If H is not
irreducible, then it decomposes as a direct sum of invariant subspaces $V_l$
for A and B, and the restrictions of A and B to each $V_l$ are unitarily
equivalent to the usual X and P.

We begin this chapter with a heuristic argument for the Stone-von Neu¬ 
mann theorem, an argument that glosses over certain (essential but tech¬ 
nical) domain issues. Then we introduce the exponentiated commutation 
relations, which should be thought of as a sort of mild strengthening of 
the ordinary canonical commutation relations. Finally, we give a precise 
statement of the theorem and provide a proof. 


14.1 A Heuristic Argument 

Suppose that A and B are any two (possibly unbounded) self-adjoint operators
on a separable Hilbert space H satisfying $[A, B] = i\hbar I$. What we
would like to conclude is that H looks like a Hilbert space direct sum of
closed subspaces $V_l$ that are invariant under A and B, and such that each
$V_l$ is unitarily equivalent to $L^2(\mathbb{R})$ in a way that turns the operators A and
B into the standard position and momentum operators X and P. That is
to say, we hope to find unitary maps $U_l : V_l \to L^2(\mathbb{R})$ such that

\[
U_l A U_l^{-1} = X, \qquad U_l B U_l^{-1} = P.
\]

This conclusion is, however, not quite correct, for reasons having to do
with the domains of the relevant operators. Nevertheless, let us consider
a heuristic argument for this conclusion. We start by forming a lowering
operator $\tilde a$ and a raising operator $\tilde a^*$ by analogy to the definitions of $a$ and
$a^*$ in Chap. 11:

\[
\tilde a = \frac{m\omega A + iB}{\sqrt{2\hbar m\omega}}, \qquad \tilde a^* = \frac{m\omega A - iB}{\sqrt{2\hbar m\omega}}.
\]

Then we look at the kernel W of the lowering operator $\tilde a$, which will be a
closed subspace of H, provided that $\tilde a$ is a closed operator. The elements
of W may be thought of as “ground states” for the operator $\tilde a^* \tilde a$. Choose
an orthonormal basis $\{\phi_0^l\}$ for W and define vectors

\[
\psi_m^l := (\tilde a^*)^m \phi_0^l, \quad m = 0, 1, 2, \ldots
\]

It is not hard to show that for $l \ne l'$, $\psi_m^l$ is orthogonal to $\psi_{m'}^{l'}$ for all m and
m'. Let $V_l$ denote the closed span of the vectors $\psi_m^l$, $m = 0, 1, 2, \ldots$

Using the calculation in Sect. 11.2, we can see that the way $\tilde a$ and $\tilde a^*$ act
on each chain (the vectors $\psi_m^l$ with l fixed and m varying) is precisely the
same as the way the standard lowering and raising operators $a$ and $a^*$ act
on the chain of eigenvectors for $a^* a$. Thus, for each l, we can construct a
unitary map $U_l$ from $V_l$ to $L^2(\mathbb{R})$ by mapping the vectors $\psi_m^l$ in $V_l$ to the
vectors $\psi_m$ in $L^2(\mathbb{R})$ described in Theorems 11.3 and 11.4. (In particular,
the vector $\psi_0 \in L^2(\mathbb{R})$ is the ground state for the harmonic oscillator, which
is a Gaussian.) Since the formula for how $\tilde a$ and $\tilde a^*$ act is the same as the
formula for how $a$ and $a^*$ act, $U_l$ will “intertwine” $\tilde a$ with $a$ and $\tilde a^*$ with
$a^*$, meaning that $U_l \tilde a = a U_l$, and similarly for $\tilde a^*$ and $a^*$. It follows
that $U_l$ also intertwines A with X and B with P.
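The reason the two chains behave identically is that the operators built from A and B satisfy the same commutation relation as the standard lowering and raising operators. As a formal check (ignoring domains, in the spirit of this section), write $\tilde a = (m\omega A + iB)/\sqrt{2\hbar m\omega}$ for the lowering operator built from A and B; the tilde is notation assumed here for illustration. Then $[A, B] = i\hbar I$ gives:

```latex
\[
[\tilde a, \tilde a^*]
= \frac{1}{2\hbar m\omega}\,[\,m\omega A + iB,\ m\omega A - iB\,]
= \frac{1}{2\hbar m\omega}\,\bigl(-i m\omega\,[A, B] + i m\omega\,[B, A]\bigr)
= \frac{-2 i m\omega \cdot i\hbar}{2\hbar m\omega}\, I
= I .
\]
```

This is the same relation $[a, a^*] = I$ that drives the ladder argument for the harmonic oscillator in Chap. 11.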

It remains only to argue (heuristically) that the spaces $V_l$ fill up the whole
Hilbert space H. Clearly, the span V of the $V_l$’s is invariant under both
$\tilde a$ and $\tilde a^*$. Thus, the orthogonal complement $V^\perp$ of V is invariant under
the adjoints $\tilde a^*$ and $\tilde a$. If $V^\perp$ is not zero, then arguing as in Chap. 11,
there should be a ground state in $V^\perp$, that is, a nonzero vector annihilated
by $\tilde a$. This vector would be orthogonal to all the $\phi_0^l$’s, contradicting the
assumption that the $\phi_0^l$’s form an orthonormal basis for the kernel of $\tilde a$.

The preceding heuristic argument cannot be completely rigorous, however,
since the counterexample in Sect. 12.2 gives a pair of operators A
and B that satisfy the canonical commutation relations but are clearly not
unitarily equivalent to the usual position and momentum operators. After
all, the “position” operator A in that section is a bounded operator, which
cannot be unitarily equivalent to the usual position operator.

What goes wrong is, as usual, a matter of domain considerations. Setting
m, $\hbar$, and $\omega$ equal to 1, we can look for a vector $\phi_0$ that is annihilated by
the operator

\[
\tilde a = \frac{1}{\sqrt{2}}(A + iB) = \frac{1}{\sqrt{2}}\left(x + \frac{d}{dx}\right).
\]

By the same argument as in Chap. 11, $\phi_0$ must be a constant multiple of the
function $e^{-x^2/2}$. The function $\phi_1 := \tilde a^* \phi_0$ is then a multiple of $x e^{-x^2/2}$. The
problem is that $\phi_1$ is not in the domain of $\tilde a^*$. After all, $\phi_1$ does not satisfy
the periodic boundary condition $\phi(-1) = \phi(1)$ that defines the domain
of B. Thus, we cannot continue to apply $\tilde a^*$ to obtain an orthogonal chain
of vectors and the entire argument breaks down.
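The failure of the boundary condition is visible from the endpoint values alone. The following sketch (with m, ħ, and ω set to 1, as in the text, and a hypothetical function name) evaluates the function $x e^{-x^2/2}$ at the two endpoints of the interval.

```python
import math

def phi1(x):
    """x * exp(-x^2 / 2): proportional to the first excited vector in the chain."""
    return x * math.exp(-x * x / 2)

left, right = phi1(-1.0), phi1(1.0)
print(left, right)      # opposite signs, since the function is odd
print(left == right)    # the periodic condition phi(-1) = phi(1) fails
```

Because the function is odd and nonzero at x = 1, the two endpoint values differ by a sign, so the function cannot lie in a domain defined by periodic boundary conditions on [-1, 1].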

What we need, then, is some additional condition that will distinguish 
between the “good” cases of the canonical commutation relations and the 
“bad” cases. One possibility for this additional condition is the exponen¬ 
tiated form of the canonical commutation relations, which are discussed 
in the following section. Our rigorous proof (Sect. 14.3) of the Stone-von 
Neumann theorem will follow the same outline as the heuristic argument 
in this section, except that the unbounded operators a and a* will be re¬ 
placed by certain bounded operators, constructed by an analog of the Weyl 
quantization. 


14.2 The Exponentiated Commutation Relations 

If A is a bounded operator on a Hilbert space H, we may define the exponential
of A, denoted either $e^A$ or $\exp(A)$, by the power series

\[
e^A = \sum_{m=0}^{\infty} \frac{A^m}{m!},
\]

where $A^0 = I$. A standard power series argument shows that if $A, B \in
\mathcal{B}(\mathbf{H})$ commute, then

\[
e^{A+B} = e^A e^B, \quad [A, B] = 0. \tag{14.1}
\]

(See Exercise 6 in Chap. 16.) Even when A and B do not commute, there
is a formula, called the Baker-Campbell-Hausdorff formula, that expresses
$e^A e^B$, for sufficiently small A and B, in the form

\[
e^A e^B = \exp\left\{ A + B + \frac{1}{2}[A, B] + \frac{1}{12}[A, [A, B]] + \cdots \right\},
\]

where the terms indicated by $\cdots$ are iterated commutators involving A
and B. (See Chap. 3 of [21] for more information.) A very special case of
this formula is obtained in the case where A and B commute with their
commutator, so that all higher commutators are zero.

Theorem 14.1 Suppose $A, B \in \mathcal{B}(\mathbf{H})$ commute with their commutator,
that is,

\[
[A, [A, B]] = [B, [A, B]] = 0.
\]

Then

\[
e^A e^B = e^{A + B + \frac{1}{2}[A, B]}.
\]

This relation may also be written as

\[
e^{A+B} = e^{-\frac{1}{2}[A, B]} e^A e^B.
\]

Note that in this special case of the Baker-Campbell-Hausdorff formula,
no smallness assumption is imposed on A and B.

Proof. We will prove that

\[
e^{tA} e^{tB} = e^{t(A+B) + \frac{t^2}{2}[A, B]}, \tag{14.2}
\]

which reduces to the desired result at t = 1. Since [A, B] commutes with
everything in sight, we can use (14.1) to split the exponential on the right-hand
side of (14.2) into two and then move the factor involving [A, B] to
the other side. Thus, (14.2) is equivalent to the relation

\[
e^{tA} e^{tB} e^{-t^2[A, B]/2} = e^{t(A+B)}. \tag{14.3}
\]

Let $\alpha(t)$ denote the left-hand side of (14.3). We will show that $\alpha(t)$ satisfies
a simple differential equation, which may be solved explicitly to obtain
$\alpha(t) = e^{t(A+B)}$.

Using term-by-term differentiation, it is easy to verify that

\[
\frac{d}{dt} e^{tC} = C e^{tC} = e^{tC} C
\]

for any $C \in \mathcal{B}(\mathbf{H})$, and that

\[
\frac{d}{dt} e^{-t^2[A, B]/2} = e^{-t^2[A, B]/2}\,(-t[A, B]).
\]

We may then differentiate $\alpha(t)$ using the product rule, which is proved the
same way as in the scalar case, giving

\[
\frac{d\alpha}{dt} = e^{tA} A e^{tB} e^{-t^2[A, B]/2} + e^{tA} e^{tB} B e^{-t^2[A, B]/2}
+ e^{tA} e^{tB} e^{-t^2[A, B]/2}\,(-t[A, B]).
\]

To simplify our expression for $d\alpha/dt$, we need an intermediate result. By
the product rule

\[
\frac{d}{dt}\, e^{-tB} A e^{tB} = e^{-tB} [A, B] e^{tB} = [A, B], \tag{14.4}
\]

because B (and, thus, $e^{tB}$) commutes with [A, B]. Noting that $e^{-tB} A e^{tB} =
A$ when t = 0, we may integrate (14.4) to get

\[
e^{-tB} A e^{tB} = A + t[A, B]. \tag{14.5}
\]

(The difference of the two sides of (14.5) has derivative zero, so by Part (a)
of Exercise 2, the two sides are equal up to a constant, which is seen to be
zero by evaluating at t = 0.)

Using (14.5), we obtain

\[
e^{tA} A e^{tB} = e^{tA} e^{tB} (e^{-tB} A e^{tB}) = e^{tA} e^{tB} (A + t[A, B]).
\]

Moreover, since everything commutes with [A, B], we may commute anything
we want past $e^{-t^2[A, B]/2}$. Thus,

\[
\frac{d\alpha}{dt} = \alpha(t)\,(A + t[A, B] + B - t[A, B]) = \alpha(t)\,(A + B).
\]

Now, according to Exercise 2, the unique solution to the differential equation
$d\alpha/dt = \alpha(t)(A + B)$ is $\alpha(t) = \alpha(0) e^{t(A+B)}$. Since $\alpha(0) = I$, we obtain
the desired result (14.3). ■
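Theorem 14.1 can be checked exactly in a small matrix model. The sketch below assumes 3×3 strictly upper triangular matrices (a choice made here for illustration, not taken from the text): for such matrices the commutator [A, B] commutes with both A and B, and the exponential series terminates after the quadratic term, so exact rational arithmetic suffices.

```python
from fractions import Fraction

# 3x3 identity matrix with exact rational entries.
I3 = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]

def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def lincomb(*terms):
    """Sum of coefficient * matrix over the given (coefficient, matrix) pairs."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(3)]
            for i in range(3)]

def exp_nilpotent(n):
    """exp(N) = I + N + N^2/2, valid because N^3 = 0 for strictly upper triangular N."""
    return lincomb((1, I3), (1, n), (Fraction(1, 2), matmul(n, n)))

A = [[0, 2, 0], [0, 0, 0], [0, 0, 0]]
B = [[0, 0, 0], [0, 0, 3], [0, 0, 0]]
comm = lincomb((1, matmul(A, B)), (-1, matmul(B, A)))   # [A, B]

lhs = matmul(exp_nilpotent(A), exp_nilpotent(B))                        # e^A e^B
rhs = exp_nilpotent(lincomb((1, A), (1, B), (Fraction(1, 2), comm)))    # e^{A+B+[A,B]/2}
naive = exp_nilpotent(lincomb((1, A), (1, B)))                          # e^{A+B}

print(lhs == rhs, lhs == naive)
```

The comparison with `naive` shows that dropping the commutator term changes the answer, so the correction $\frac{1}{2}[A, B]$ is genuinely needed when A and B do not commute.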

Suppose, now, that A and B are unbounded self-adjoint operators satisfying

\[
[A, B] = i\hbar I, \tag{14.6}
\]

where the exponentials $e^{isA}$ and $e^{itB}$ are defined by means of the spectral
theorem. If we formally apply Theorem 14.1 to $isA$ and $itB$ (even though these
operators are unbounded), we obtain

\[
e^{i(sA + tB)} = e^{ist\hbar/2} e^{isA} e^{itB} = e^{-ist\hbar/2} e^{itB} e^{isA},
\]

so that

\[
e^{isA} e^{itB} = e^{-ist\hbar} e^{itB} e^{isA}. \tag{14.7}
\]

It is essential to emphasize that the conclusion (14.7) is only formal, since
it assumes that results for bounded operators carry over to unbounded
operators, which is false in general. Nevertheless, we may hope that in
“good” cases, self-adjoint operators satisfying (14.6) will also satisfy (14.7).

Extending the preceding discussion to the case of several degrees of freedom
in an obvious way, we are led to the following definition.

Definition 14.2 If $A_1, \ldots, A_n$ and $B_1, \ldots, B_n$ are possibly unbounded
self-adjoint operators on H, the A’s and B’s satisfy the exponentiated
commutation relations if the following relations hold for all $1 \le j, k \le n$ and
$s, t \in \mathbb{R}$:

\[
\begin{aligned}
e^{isA_j} e^{itA_k} &= e^{itA_k} e^{isA_j}, \\
e^{isB_j} e^{itB_k} &= e^{itB_k} e^{isB_j}, \\
e^{isA_j} e^{itB_k} &= e^{-ist\hbar\delta_{jk}} e^{itB_k} e^{isA_j}.
\end{aligned}
\]

The operators $e^{isA_j}$ and $e^{itB_k}$ are defined by the spectral theorem for unbounded
self-adjoint operators, and they are unitary operators, defined on
all of H. Thus, when we say that the exponentiated commutation relations
hold, we mean that they hold on the entire Hilbert space H.

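Notation 14.3 Suppose operators $A_1, \ldots, A_n$ and $B_1, \ldots, B_n$ satisfy the
exponentiated commutation relations. Then for all a and b in $\mathbb{R}^n$, let
$e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})}$ denote the unitary operator given by

\[
e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})} = e^{i\hbar(\mathbf{a}\cdot\mathbf{b})/2}\, e^{ia_1A_1} \cdots e^{ia_nA_n}\, e^{ib_1B_1} \cdots e^{ib_nB_n}. \tag{14.8}
\]

Equation (14.8) is nothing but what we obtain by formally applying
Theorem 14.1 to the operators $i\mathbf{a}\cdot\mathbf{A}$ and $i\mathbf{b}\cdot\mathbf{B}$ and then further splitting
the exponentials by formally applying (14.1). The notation may be further
justified by checking (Exercise 4) that the operators

\[
U_{\mathbf{a},\mathbf{b}}(t) := e^{it^2\hbar(\mathbf{a}\cdot\mathbf{b})/2}\, e^{ita_1A_1} \cdots e^{ita_nA_n}\, e^{itb_1B_1} \cdots e^{itb_nB_n} \tag{14.9}
\]

form a strongly continuous one-parameter unitary group. If we then define
$\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B}$ as the infinitesimal generator (Sect. 10.2) of $U_{\mathbf{a},\mathbf{b}}$, the
relation (14.8) will indeed hold. Using the definition of $e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})}$ and the
exponentiated commutation relations, a simple calculation shows that

\[
e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})}\, e^{i(\mathbf{a}'\cdot\mathbf{A} + \mathbf{b}'\cdot\mathbf{B})}
= e^{-i\hbar(\mathbf{a}\cdot\mathbf{b}' - \mathbf{b}\cdot\mathbf{a}')/2}\, e^{i((\mathbf{a}+\mathbf{a}')\cdot\mathbf{A} + (\mathbf{b}+\mathbf{b}')\cdot\mathbf{B})}. \tag{14.10}
\]

In particular, $e^{-i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})}$ is the inverse of $e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})}$, as the notation
suggests.

The following examples show that in the good case (the usual position
and momentum operators on $L^2(\mathbb{R}^n)$), the exponentiated commutation relations
do hold, whereas in the bad case (the counterexample in Sect. 12.2),
they do not.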

Example 14.4 Let $A_j$ be the usual position operator $X_j$ acting on $L^2(\mathbb{R}^n)$
and let $B_j$ be the usual momentum operator $P_j$. Then the A’s and B’s
satisfy the exponentiated commutation relations.

Proof. Since $X_j$ is just multiplication by $x_j$, it is easily verified that $e^{isX_j}$
is just multiplication by $e^{isx_j}$. Meanwhile, the exponentiated momentum
operators satisfy (Example 10.16)

\[
(e^{itP_j}\psi)(\mathbf{x}) = \psi(\mathbf{x} + t\hbar\mathbf{e}_j).
\]

It is then evident that $e^{isX_j}$ commutes with $e^{itX_k}$ and that $e^{isP_j}$ commutes
with $e^{itP_k}$. We may also compute that

\[
\begin{aligned}
(e^{itP_k} e^{isX_j}\psi)(\mathbf{x}) &= e^{is(x_j + t\hbar\delta_{jk})}\,\psi(\mathbf{x} + t\hbar\mathbf{e}_k) \\
&= e^{ist\hbar\delta_{jk}}\,(e^{isX_j} e^{itP_k}\psi)(\mathbf{x}),
\end{aligned}
\]

which is what we wanted to prove. ■
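The computation in this proof, together with the composition rule (14.10), can be spot-checked pointwise for n = 1. The sketch below treats operators as Python functions acting on wave functions; the value of ħ, the test function, and all names are illustrative assumptions.

```python
import cmath
import math

HBAR = 0.7  # arbitrary positive value (assumption)

def psi(x):
    """A sample test function (Gaussian times a linear factor)."""
    return math.exp(-x * x / 2) * (1 + 0.3 * x)

def exp_isX(s, f):
    """e^{isX}: multiplication by e^{isx}."""
    return lambda x: cmath.exp(1j * s * x) * f(x)

def exp_itP(t, f):
    """e^{itP}: translation, (e^{itP} f)(x) = f(x + t*hbar)."""
    return lambda x: f(x + t * HBAR)

def weyl_op(a, b, f):
    """e^{i(aX + bP)} defined as in (14.8) with n = 1."""
    phase = cmath.exp(1j * HBAR * a * b / 2)
    g = exp_isX(a, exp_itP(b, f))
    return lambda x: phase * g(x)

s, t, x0 = 1.3, -0.4, 0.25

# Exponentiated commutation relation: e^{isX} e^{itP} = e^{-ist hbar} e^{itP} e^{isX}.
ccr_lhs = exp_isX(s, exp_itP(t, psi))(x0)
ccr_rhs = cmath.exp(-1j * s * t * HBAR) * exp_itP(t, exp_isX(s, psi))(x0)

# Composition rule (14.10) with (a, b) = (s, t) and (a', b') = (a2, b2).
a2, b2 = 0.6, 1.1
comp_lhs = weyl_op(s, t, weyl_op(a2, b2, psi))(x0)
comp_rhs = (cmath.exp(-1j * HBAR * (s * b2 - t * a2) / 2)
            * weyl_op(s + a2, t + b2, psi)(x0))

print(abs(ccr_lhs - ccr_rhs), abs(comp_lhs - comp_rhs))
```

Both differences should vanish up to rounding error. Evaluating at a single point is only a consistency check on signs and phases, not a proof.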

Example 14.5 Let $A$ be the operator $A$ in Sect. 12.2 and let $B$ be (the unique self-adjoint extension of) the operator $B$ in that section. Then $A$ and $B$ do not satisfy the exponentiated commutation relations.

Proof. The operator $A$ is multiplication by $x$, and so the operator $e^{isA}$ is just multiplication by $e^{isx}$. Meanwhile, the operator $B$ is $-i\hbar\,d/dx$, with periodic boundary conditions. We will now demonstrate that $e^{itB}$ consists of "translation with wraparound." Specifically, for any $a\in\mathbb{R}$ and $\psi\in L^2([-1,1])$, let us define $S_a\psi\in L^2([-1,1])$ by

\[ (S_a\psi)(x) = \psi(x + a - 2m_{x,a}), \]

where $m_{x,a}$ is the unique integer such that

\[ -1 \le x + a - 2m_{x,a} < 1. \]

It is easy to check that $S_a$ is a unitary map of $L^2([-1,1])$ for each $a\in\mathbb{R}$. We then claim that

\[ e^{itB} = S_{\hbar t}. \tag{14.11} \]

To verify (14.11), observe that $B$ has an orthonormal basis of eigenvectors, namely the functions $\psi_n(x) := e^{\pi i n x}$, $n\in\mathbb{Z}$, with the corresponding eigenvalues being $\pi n\hbar$. Thus, if we compute $e^{itB}$ by means of the spectral theorem, we have

\[ e^{itB}\psi_n = e^{i\pi n t\hbar}\,\psi_n. \]

On the other hand,

\[ (S_a\psi_n)(x) = e^{\pi i n(x + a - 2m_{x,a})} = e^{-2\pi i n m_{x,a}}\,e^{\pi i n a}\,e^{\pi i n x} = e^{\pi i n a}\,\psi_n(x), \]

showing that $e^{itB}$ and $S_{\hbar t}$ agree on each of the functions $\psi_n$, $n\in\mathbb{Z}$, and thus on all of $L^2([-1,1])$.

Having computed both $e^{isA}$ and $e^{itB}$, we may now easily see that these operators do not satisfy the exponentiated commutation relations. We have, for example, with $a = t\hbar$, that

\[ (e^{isA}e^{itB}\psi)(x) = e^{isx}\,\psi(x + t\hbar - 2m_{x,a}), \]

whereas

\[ (e^{itB}e^{isA}\psi)(x) = e^{is(x + t\hbar - 2m_{x,a})}\,\psi(x + t\hbar - 2m_{x,a}). \]

The function $e^{is(x + t\hbar - 2m_{x,a})}$ is not equal to $e^{ist\hbar}e^{isx}$ but rather to

\[ e^{ist\hbar}\,e^{isx}\,e^{-2ism_{x,a}}, \]

where $e^{-2ism_{x,a}}$ is not always equal to 1. ■
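The wraparound phase can be made concrete with a small Python sketch. It assumes the half-open convention $-1 \le x + a - 2m < 1$ for the wraparound integer, with $a$ playing the role of $t\hbar$; the function names `wrap_index` and `phase_ratio` are hypothetical helpers introduced only for this illustration.

```python
import cmath
import math


def wrap_index(x, a):
    """Unique integer m with -1 <= x + a - 2m < 1."""
    return math.floor((x + a + 1.0) / 2.0)


def phase_ratio(s, x, a):
    """e^{is(x + a - 2m)} divided by e^{isa} e^{isx}; this equals e^{-2ism}."""
    m = wrap_index(x, a)
    wrapped = cmath.exp(1j * s * (x + a - 2 * m))
    unwrapped = cmath.exp(1j * s * (a + x))
    return wrapped / unwrapped
```

When no wraparound occurs (m = 0) the ratio is 1, as the exponentiated commutation relations would require. When the translated point leaves the interval, the ratio is $e^{-2ism}$, which differs from 1 unless $s$ happens to be a multiple of $\pi$.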


14.3 The Theorem 

We give two versions of the Stone-von Neumann theorem, one for general operators satisfying the exponentiated commutation relations and one for the special case where the operators act irreducibly.

Definition 14.6 Operators $A_1,\dots,A_n$ and $B_1,\dots,B_n$ satisfying the exponentiated commutation relations are said to act irreducibly on $\mathbf{H}$ if the only closed subspaces of $\mathbf{H}$ that are invariant under every $e^{itA_j}$ and every $e^{itB_j}$ are $\{0\}$ and $\mathbf{H}$.

Proposition 14.7 The usual position and momentum operators act irreducibly on $L^2(\mathbb{R}^n)$.

We delay the proof of this result until near the end of this section.

Theorem 14.8 (Stone-von Neumann Theorem) Suppose $A_1,\dots,A_n$ and $B_1,\dots,B_n$ are self-adjoint operators on $\mathbf{H}$ satisfying the exponentiated commutation relations. Then $\mathbf{H}$ can be decomposed as an orthogonal direct sum of closed subspaces $\{V_l\}$ with the following properties. First, each $V_l$ is invariant under $e^{itA_j}$ and $e^{itB_j}$ for all $j$ and $t$. Second, there exist unitary operators $U_l : V_l \to L^2(\mathbb{R}^n)$ such that

\[ U_l\,e^{itA_j}\,U_l^{-1} = e^{itX_j} \]

and

\[ U_l\,e^{itB_j}\,U_l^{-1} = e^{itP_j} \]

for all $j$ and $t$.

If, in addition, the $A$'s and $B$'s act irreducibly on $\mathbf{H}$, then there exists a single unitary map $U : \mathbf{H} \to L^2(\mathbb{R}^n)$ such that

\[ U e^{itA_j} U^{-1} = e^{itX_j} \]

and

\[ U e^{itB_j} U^{-1} = e^{itP_j} \]

for all $j$ and $t$. The map $U$ is unique up to multiplication by a constant of absolute value 1.

The preceding results can be expressed in terms of the Heisenberg group; 
see Exercise 6. 

Our strategy (as in von Neumann's 1931 paper [41]) in proving Theorem 14.8 is to follow the outline of the heuristic argument in Sect. 14.1, but replacing the unbounded raising and lowering operators by the bounded operators $e^{i(a\cdot A+b\cdot B)}$ in Notation 14.3. If we define $\phi_0\in L^2(\mathbb{R}^n)$ by

\[ \phi_0(x) = (\pi\sigma)^{-n/4}\,e^{-|x|^2/(2\sigma)} \tag{14.12} \]

for some $\sigma > 0$, then $\phi_0$ is a unit vector, which we may think of as the ground state of an $n$-dimensional harmonic oscillator with frequency $\omega = \hbar/(m\sigma)$. We can easily compute the Weyl symbol of the projection $|\phi_0\rangle\langle\phi_0|$ onto $\phi_0$ as follows:

\[ f_0(x,p) := Q_{\mathrm{Weyl}}^{-1}(|\phi_0\rangle\langle\phi_0|) = 2^n\,e^{-|x|^2/\sigma}\,e^{-\sigma|p|^2/\hbar^2}. \tag{14.13} \]

(See Exercise 9 in Chap. 13.)

We may define a generalized Weyl quantization $Q$ for $\mathbf{H}$ by using the operators $e^{i(a\cdot A+b\cdot B)}$ in place of the operators $e^{i(a\cdot X+b\cdot P)}$ in (13.17). We will show that the operator $P := Q(f_0)$ is an orthogonal projection, and we will take $W := \mathrm{Range}(P)$ as our space of ground states in $\mathbf{H}$. A crucial result will be that the projection $P$ is nonzero and, indeed, that the restriction of $P$ to any nonzero subspace invariant under the $e^{i(a\cdot A+b\cdot B)}$'s is nonzero.

If $\{\psi^l\}$ is an orthonormal basis for $W$, consider the vectors

\[ \psi^l_{a,b} := e^{i(a\cdot A+b\cdot B)}\psi^l. \]

We will show that these vectors are orthogonal for different values of $l$, and that for fixed $l$, the inner product of two such vectors is the same as in the $L^2(\mathbb{R}^n)$ case. Thus, if $V_l$ denotes the closed span of the $\psi^l_{a,b}$'s with $l$ fixed and $a$ and $b$ varying, we can construct a unitary map from $V_l$ to $L^2(\mathbb{R}^n)$ that intertwines the operators $e^{i(a\cdot A+b\cdot B)}$ with the operators $e^{i(a\cdot X+b\cdot P)}$. The sum of the $V_l$'s must be all of $\mathbf{H}$, for if not, the orthogonal complement $Y$ of the span would be invariant under the $e^{i(a\cdot A+b\cdot B)}$'s. Thus, the restriction of $P$ to $Y$ would be nonzero, implying that there are elements of $W := \mathrm{Range}(P)$ orthogonal to every $\psi^l$, contradicting the assumption that the $\psi^l$'s span $W$.

The rest of this section will flesh out the argument sketched in the preceding paragraphs.

Definition 14.9 Suppose self-adjoint operators $A_1,\dots,A_n$ and $B_1,\dots,B_n$ satisfy the exponentiated commutation relations on $\mathbf{H}$. For any $f\in\mathcal{S}(\mathbb{R}^{2n})$, define $Q(f)\in\mathcal{B}(\mathbf{H})$ by the formula

\[ Q(f) = (2\pi)^{-n}\int_{\mathbb{R}^{2n}} \hat f(a,b)\,e^{i(a\cdot A+b\cdot B)}\,da\,db, \]

where $\hat f$ is the Fourier transform of $f$ and where $e^{i(a\cdot A+b\cdot B)}$ is as in Notation 14.3. The integral is a Bochner integral with values in the Banach space $\mathcal{B}(\mathbf{H})$.

We will assume the following standard properties of the Bochner integral (Sect. V.5 of [46]). First, any continuous function $f : \mathbb{R}^{2n}\to\mathcal{B}(\mathbf{H})$ for which $\int \|f(x)\|\,dx < \infty$ has a well-defined Bochner integral. Second, the Bochner integral commutes with applying bounded linear transformations. Third, a version of Fubini's theorem holds.

Proposition 14.10 For any operators satisfying the exponentiated commutation relations, the associated map $Q$ in Definition 14.9 has the following properties.

1. If $f\in\mathcal{S}(\mathbb{R}^{2n})$ is real valued, $Q(f)$ is self-adjoint.

2. For all $a$ and $b$ in $\mathbb{R}^n$ and $f\in\mathcal{S}(\mathbb{R}^{2n})$, we have

\[ e^{i(a\cdot A+b\cdot B)}\,Q(f) = Q(f'), \qquad Q(f)\,e^{i(a\cdot A+b\cdot B)} = Q(f''), \]

where $f'$ and $f''$ are the functions with Fourier transforms given by

\[ \hat{f'}(a',b') = e^{i\hbar(a'\cdot b - a\cdot b')/2}\,\hat f(a'-a,\,b'-b), \]
\[ \hat{f''}(a',b') = e^{-i\hbar(a'\cdot b - a\cdot b')/2}\,\hat f(a'-a,\,b'-b). \]

3. For all $f$ and $g$ in $\mathcal{S}(\mathbb{R}^{2n})$, we have

\[ Q(f)\,Q(g) = Q(f\star g), \]

where $\star$ is the Moyal product described in Proposition 13.9.

4. For all $f\in\mathcal{S}(\mathbb{R}^{2n})$, if $Q(f) = 0$ then $f = 0$.

Using both parts of Point 2 of the proposition, we can see that for all $a, b\in\mathbb{R}^n$, we have

\[ e^{-i(a\cdot A+b\cdot B)}\,Q(f)\,e^{i(a\cdot A+b\cdot B)} = Q(g), \]

where

\[ \hat g(a',b') = e^{-i\hbar(a'\cdot b - a\cdot b')}\,\hat f(a',b'). \tag{14.14} \]

Proof. For Point 1, we can re-express $Q(f)$ as


\[ (2\pi)^{-n}\int_{\mathbb{R}^{2n}} \frac{1}{2}\Big[\hat f(a,b)\,e^{i(a\cdot A+b\cdot B)} + \hat f(-a,-b)\,e^{-i(a\cdot A+b\cdot B)}\Big]\,da\,db, \]

since the change of variables $a' = -a$, $b' = -b$ makes the integral of the second term equal to that of the first. If $f$ is real valued, then $\hat f(-a,-b)$ is the conjugate of $\hat f(a,b)$, so that the expression in square brackets in the integral is self-adjoint for each $(a,b)$.

For the first part of Point 2, we use (14.10) to obtain

\[ e^{i(a\cdot A+b\cdot B)}\,Q(f) = (2\pi)^{-n}\int_{\mathbb{R}^{2n}} e^{-i\hbar(a\cdot b' - b\cdot a')/2}\,\hat f(a',b')\,e^{i((a+a')\cdot A+(b+b')\cdot B)}\,da'\,db'. \]

Making the change of variables $a'' = a' + a$ and $b'' = b' + b$ and simplifying gives the desired result. The proof of the second part of Point 2 is similar.

The proof of Point 3 is precisely the same as the proof of Proposition 13.9, which relies only on the exponentiated commutation relations.

For Point 4, suppose that $Q(f) = 0$ for some $f\in\mathcal{S}(\mathbb{R}^{2n})$. Then for all $\phi,\psi\in\mathbf{H}$ and all $a, b\in\mathbb{R}^n$, we have

\[ 0 = \big\langle e^{i(a\cdot A+b\cdot B)}\phi,\; Q(f)\,e^{i(a\cdot A+b\cdot B)}\psi\big\rangle = \big\langle \phi,\; e^{-i(a\cdot A+b\cdot B)}\,Q(f)\,e^{i(a\cdot A+b\cdot B)}\psi\big\rangle = \langle\phi,\,Q(g)\psi\rangle, \]

where $g$ is as in (14.14). Thus,

\[ 0 = \int e^{-i\hbar(a'\cdot b - a\cdot b')}\,\hat f(a',b')\,\big\langle\phi,\,e^{i(a'\cdot A+b'\cdot B)}\psi\big\rangle\,da'\,db' \tag{14.15} \]

for all $\phi$, $\psi$ and $a$, $b$. But (14.15) is just computing the inverse Fourier transform of the function $\hat f(a',b')\langle\phi, e^{i(a'\cdot A+b'\cdot B)}\psi\rangle$, evaluated, up to a constant factor, at the point $\hbar(-b, a)$; as $a$ and $b$ vary, this point ranges over all of $\mathbb{R}^{2n}$. By the Fourier inversion formula, then, this function must be zero for almost every pair $(a',b')$. Now, the function $\langle\phi, e^{i(a\cdot A+b\cdot B)}\psi\rangle$ is a continuous function of $(a,b)$ and, by taking $\phi = e^{i(a_0\cdot A+b_0\cdot B)}\psi$, can be made to be nonzero at any given point $(a_0,b_0)$ in $\mathbb{R}^{2n}$, and thus also in a neighborhood of that point. Thus, actually, $\hat f$ is identically zero and so also is $f$. ■

Lemma 14.11 Let $f_0$ be the function on $\mathbb{R}^{2n}$ given by

\[ f_0(x,p) = 2^n\,e^{-|x|^2/\sigma}\,e^{-\sigma|p|^2/\hbar^2}, \]

where $\sigma$ is a fixed positive number. Then for all $a, b\in\mathbb{R}^n$, we have

\[ Q(f_0)\,e^{i(a\cdot A+b\cdot B)}\,Q(f_0) = e^{-\sigma|a|^2/4}\,e^{-\hbar^2|b|^2/(4\sigma)}\,Q(f_0). \tag{14.16} \]

In particular,

\[ Q(f_0)^2 = Q(f_0). \]

Proof. By Proposition 14.10, (14.16) is equivalent to the assertion that

\[ f_0\star f_0' = e^{-\sigma|a|^2/4}\,e^{-\hbar^2|b|^2/(4\sigma)}\,f_0, \tag{14.17} \]

where $f_0'$ is obtained from $f_0$ as in Point 2 of Proposition 14.10.

Now, it is certainly possible to establish (14.17) by direct computation from the definitions of $f_0$ and $\star$; all the integrals involved will be Gaussian integrals, which can be evaluated by means of Proposition A.22. This approach, however, is both painful and unilluminating. A more sensible approach is to observe that it suffices to verify (14.16) for the ordinary Weyl quantization on $L^2(\mathbb{R}^n)$. After all, (14.16) is equivalent to (14.17), which in turn is equivalent to the identity

\[ Q_{\mathrm{Weyl}}(f_0)\,e^{i(a\cdot X+b\cdot P)}\,Q_{\mathrm{Weyl}}(f_0) = e^{-\sigma|a|^2/4}\,e^{-\hbar^2|b|^2/(4\sigma)}\,Q_{\mathrm{Weyl}}(f_0), \tag{14.18} \]

by applying Proposition 14.10 in the case $Q = Q_{\mathrm{Weyl}}$.

Now, by Exercise 9 in Chap. 13, $Q_{\mathrm{Weyl}}(f_0)$ is the one-dimensional projection $|\phi_0\rangle\langle\phi_0|$, where $\phi_0(x) = (\pi\sigma)^{-n/4}e^{-|x|^2/(2\sigma)}$. Thus,

\[ Q_{\mathrm{Weyl}}(f_0)\,e^{i(a\cdot X+b\cdot P)}\,Q_{\mathrm{Weyl}}(f_0) = |\phi_0\rangle\langle\phi_0|\,e^{i(a\cdot X+b\cdot P)}\,|\phi_0\rangle\langle\phi_0| = c\,|\phi_0\rangle\langle\phi_0|, \tag{14.19} \]

where

\[ c = \langle\phi_0|\,e^{i(a\cdot X+b\cdot P)}\,|\phi_0\rangle. \]

To compute $c$, we use (13.20), which gives

\[ c = (\pi\sigma)^{-n/2}\,e^{i\hbar(a\cdot b)/2}\int_{\mathbb{R}^n} e^{-|x|^2/(2\sigma)}\,e^{ia\cdot x}\,e^{-|x+\hbar b|^2/(2\sigma)}\,dx. \tag{14.20} \]

The integral in (14.20) can be computed by expanding $|x + \hbar b|^2$, collecting terms in the exponent, and applying Proposition A.22. The result, after a bit of algebra, is

\[ c = e^{-\sigma|a|^2/4}\,e^{-\hbar^2|b|^2/(4\sigma)}, \]

which gives (14.18). ■ 
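The "bit of algebra" for the constant $c$ can be cross-checked numerically. The sketch below, for n = 1 only, approximates the integral in (14.20) by a plain Riemann sum and compares it with the Gaussian closed form. The function names, the integration window, and the step size are assumptions made for illustration; the quadrature is a crude stand-in for the exact evaluation via Proposition A.22.

```python
import cmath
import math


def c_numeric(a, b, sigma, hbar, half_width=20.0, step=0.01):
    """Approximate (14.20) for n = 1 by a Riemann sum on [-half_width, half_width]."""
    n_steps = int(round(2 * half_width / step))
    total = 0j
    for k in range(n_steps + 1):
        x = -half_width + k * step
        total += (math.exp(-x * x / (2 * sigma))
                  * cmath.exp(1j * a * x)
                  * math.exp(-(x + hbar * b) ** 2 / (2 * sigma)))
    prefactor = (math.pi * sigma) ** -0.5 * cmath.exp(1j * hbar * a * b / 2)
    return prefactor * total * step


def c_closed_form(a, b, sigma, hbar):
    """The Gaussian value e^{-sigma a^2/4} e^{-hbar^2 b^2/(4 sigma)}."""
    return math.exp(-sigma * a * a / 4) * math.exp(-hbar ** 2 * b * b / (4 * sigma))
```

For rapidly decaying Gaussian integrands a uniform Riemann sum converges very quickly, so the two values should agree to many digits for moderate parameters.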

We now prove the claimed irreducibility of the usual position and momentum operators.

Proof of Proposition 14.7. Given operators $A_1,\dots,A_n$ and $B_1,\dots,B_n$ satisfying the exponentiated commutation relations, consider the operator $Q(f_0)$, where $f_0$ is as in (14.13). According to Lemma 14.11, $Q(f_0)^2 = Q(f_0)$. Since also $f_0$ is real valued, $Q(f_0)$ is self-adjoint and thus an orthogonal projection. Suppose that the range of the orthogonal projection $Q(f_0)$ is one-dimensional. We then claim that the $A$'s and $B$'s act irreducibly. If not, there would exist a nontrivial closed subspace $V$ that is invariant under each of the operators $e^{i(a\cdot A+b\cdot B)}$. Then the nonzero subspace $V^\perp$ would also be invariant under each of the operators $(e^{i(a\cdot A+b\cdot B)})^* = e^{-i(a\cdot A+b\cdot B)}$. Thus, the exponentiated commutation relations are satisfied in both $V$ and $V^\perp$, with the $A$'s and $B$'s being the infinitesimal generators of the restrictions of $e^{itA_j}$ and $e^{itB_j}$ to each subspace.

It follows that the restriction of $Q(f_0)$ to each of these subspaces may be thought of as the generalized Weyl quantizations for $V$ and $V^\perp$ of the function $f_0$. Applying Point 4 of Proposition 14.10 to $V$ and to $V^\perp$, we conclude that the restrictions of $Q(f_0)$ to $V$ and to $V^\perp$ are nonzero. Thus, both $V$ and $V^\perp$ will contain nonzero elements of $\mathrm{Range}(Q(f_0))$, contradicting our assumption that $\mathrm{Range}(Q(f_0))$ is one dimensional.

In the case of $L^2(\mathbb{R}^n)$, we have $Q_{\mathrm{Weyl}}(f_0) = |\phi_0\rangle\langle\phi_0|$, where $\phi_0$ is given by (14.12), which clearly has a one-dimensional range. Thus, the usual position and momentum operators act irreducibly on $L^2(\mathbb{R}^n)$. ■

We are finally ready for the proof of the Stone-von Neumann theorem.

Proof of Theorem 14.8. Let $W = \mathrm{Range}(Q(f_0))$, where $f_0$ is given by (14.13) for some fixed $\sigma > 0$. For $\phi,\psi\in W$, we can use (14.10), Lemma 14.11, and the fact that $Q(f_0)$ is the identity on $W$ to obtain

\[
\begin{aligned}
\big\langle e^{i(a\cdot A+b\cdot B)}\phi,\; e^{i(a'\cdot A+b'\cdot B)}\psi\big\rangle
&= \big\langle Q(f_0)\phi,\; e^{-i(a\cdot A+b\cdot B)}\,e^{i(a'\cdot A+b'\cdot B)}\,Q(f_0)\psi\big\rangle \\
&= e^{i\hbar(a\cdot b'-b\cdot a')/2}\,\big\langle\phi,\; Q(f_0)\,e^{i((a'-a)\cdot A+(b'-b)\cdot B)}\,Q(f_0)\psi\big\rangle \\
&= e^{i\hbar(a\cdot b'-b\cdot a')/2}\,e^{-\sigma|a'-a|^2/4}\,e^{-\hbar^2|b'-b|^2/(4\sigma)}\,\langle\phi,\psi\rangle.
\end{aligned}
\tag{14.21}
\]

Now let $\{\psi^l\}$ be an orthonormal basis for $W$ and define vectors $\psi^l_{a,b}$, $a, b\in\mathbb{R}^n$, by

\[ \psi^l_{a,b} = e^{i(a\cdot A+b\cdot B)}\psi^l. \]

By (14.21), $\psi^l_{a,b}$ is orthogonal to $\psi^{l'}_{a',b'}$ whenever $l\neq l'$. Furthermore,

\[ \big\langle\psi^l_{a,b},\,\psi^l_{a',b'}\big\rangle = e^{i\hbar(a\cdot b'-b\cdot a')/2}\,e^{-\sigma|a'-a|^2/4}\,e^{-\hbar^2|b'-b|^2/(4\sigma)}, \tag{14.22} \]

where the right-hand side of (14.22) is "universal," that is, independent of $l$ and independent of the particular Hilbert space in which we are working.

Let $V_l$ be the closed span of the vectors $\psi^l_{a,b}$ with $l$ fixed and $a$, $b$ varying. We may define a map $U_l : V_l\to L^2(\mathbb{R}^n)$ by requiring that

\[ U_l\Big(\sum_{j=1}^N c_j\,\psi^l_{a_j,b_j}\Big) = \sum_{j=1}^N c_j\,\phi_{a_j,b_j} \]

for every sequence $a_1,\dots,a_N$ and $b_1,\dots,b_N$ of vectors and all constants $c_1,\dots,c_N$, where

\[ \phi_{a,b} = e^{i(a\cdot X+b\cdot P)}\phi_0. \]

This map is isometric by (14.22) on linear combinations of the $\psi^l_{a,b}$'s and thus extends uniquely to an isometric map of $V_l$ into $L^2(\mathbb{R}^n)$. [In particular, $U_l$ is well defined: If some linear combination of $\psi^l_{a,b}$'s is zero, then this linear combination has norm zero and so its image under $U_l$ also has norm zero and is thus zero in $L^2(\mathbb{R}^n)$.]

Now, $V_l$ is invariant under the operators $e^{i(a\cdot A+b\cdot B)}$ by (14.10), and, similarly, the image of $V_l$ under $U_l$ is invariant under the operators $e^{i(a\cdot X+b\cdot P)}$. By the irreducibility of $L^2(\mathbb{R}^n)$ (Proposition 14.7), we conclude that $U_l$ maps $V_l$ onto $L^2(\mathbb{R}^n)$ and is, therefore, unitary. Furthermore, using (14.10) and the analogous expression (13.31) for the position and momentum operators, it is easy to check that each $U_l$ intertwines $e^{i(a\cdot A+b\cdot B)}$ with $e^{i(a\cdot X+b\cdot P)}$, for all $a, b\in\mathbb{R}^n$. In particular, taking $a = te_j$ and $b = 0$, we see that $U_l$ intertwines $e^{itA_j}$ with $e^{itX_j}$. Similarly, taking $a = 0$ and $b = te_j$, we see that $U_l$ intertwines $e^{itB_j}$ with $e^{itP_j}$.

We now argue that the Hilbert space direct sum of the orthogonal subspaces $V_l$ is all of $\mathbf{H}$. If not, then as in the proof of Proposition 14.7, the orthogonal complement $Y$ of this sum would be invariant under the operators $e^{i(a\cdot A+b\cdot B)}$ and thus also under the operator $Q(f_0)$. Furthermore, as in the proof of Proposition 14.7, the restriction of $Q(f_0)$ to $Y$ would be nonzero. Thus, there would exist elements of $W = \mathrm{Range}(Q(f_0))$ orthogonal to each $\psi^l$, contradicting the assumption that the $\psi^l$'s span $W$.

It remains only to address the irreducible case. If the $A$'s and $B$'s act irreducibly, then there can be only one subspace, $V_1 = \mathbf{H}$, which means that $W$ must be one dimensional. Any unitary map $U : \mathbf{H}\to L^2(\mathbb{R}^n)$ that intertwines each operator $e^{i(a\cdot A+b\cdot B)}$ with $e^{i(a\cdot X+b\cdot P)}$ must also intertwine each operator of the form $Q(f)$ with $Q_{\mathrm{Weyl}}(f)$. It follows that $U$ must map the one-dimensional subspace $W$ unitarily onto the one-dimensional range of $Q_{\mathrm{Weyl}}(f_0) = |\phi_0\rangle\langle\phi_0|$. Thus, the restriction of $U$ to $W$ is unique up to a constant of absolute value 1. But the reasoning leading to the existence of $U$ shows that $U$ is determined by its action on $W$, so the entire map $U$ is unique up to a constant. ■


14.4 The Segal-Bargmann Space 

A simple example of the Stone-von Neumann theorem is provided by the Hilbert space $\mathbf{H} := L^2(\mathbb{R}^n)$, together with the operators $A_j := P_j$ and $B_j := -X_j$. In that case (Exercise 3), the unitary map $U$ in the Stone-von Neumann theorem will simply be a scaled version of the Fourier transform, as in Definition 6.1. To obtain a more interesting example, we construct a Hilbert space consisting of holomorphic functions on $\mathbb{C}^n$.

14.4.1 The Raising and Lowering Operators

A smooth function $F : \mathbb{C}^n\to\mathbb{C}$ is said to be holomorphic if it is holomorphic as a function of $z_j$ with the other $z_k$'s fixed. Equivalently, $F$ is holomorphic if $\partial F/\partial\bar z_j = 0$ for all $j$, where

\[ \frac{\partial}{\partial\bar z_j} = \frac{1}{2}\Big(\frac{\partial}{\partial x_j} + i\frac{\partial}{\partial y_j}\Big). \]

The operator

\[ \frac{\partial}{\partial z_j} = \frac{1}{2}\Big(\frac{\partial}{\partial x_j} - i\frac{\partial}{\partial y_j}\Big) \]

preserves the space of holomorphic functions on $\mathbb{C}^n$.

Consider the operators $z_j$ (i.e., multiplication by $z_j$) and $\hbar\,\partial/\partial z_j$, acting on the space of holomorphic functions on $\mathbb{C}^n$. Fock [9] observed that these operators satisfy the following commutation relations:

\[ [z_j, z_k] = \Big[\frac{\partial}{\partial z_j},\frac{\partial}{\partial z_k}\Big] = 0, \qquad \Big[\hbar\frac{\partial}{\partial z_j},\, z_k\Big] = \hbar\,\delta_{jk}\,I. \tag{14.23} \]

These are essentially the same commutation relations as those satisfied by the raising and lowering operators considered in Sect. 11.2. Specifically, (14.23) are the relations that would be satisfied by the natural higher-dimensional analogs of the operators $a$ and $a^*$ in that section if we omitted the factor of $\sqrt{\hbar}$ in the denominator in (11.4) and (11.5).

Now, if we wish to interpret the operators $z_j$ and $\hbar\,\partial/\partial z_j$ as raising and lowering operators, then we should look for an inner product on the space of holomorphic functions that would make these two operators adjoints of each other. After all, the analysis in Chap. 11 strongly depends on the assumption that $a$ and $a^*$ are adjoints of each other. In the early 1960s, Segal [36] and Bargmann [2] identified such an inner product. Once we have described this Segal-Bargmann inner product, we will construct self-adjoint "position" and "momentum" operators as appropriate linear combinations of $z_j$ and $\hbar\,\partial/\partial z_j$. We will then verify the exponentiated commutation relations and irreducibility, allowing us to apply the Stone-von Neumann theorem.

We look for an $L^2$ inner product with respect to a measure having a positive density with respect to the Lebesgue measure on $\mathbb{C}^n$.


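Lemma 14.12 Suppose that $\mu$ is a smooth, strictly positive density on $\mathbb{C}^n$ and that $F$ and $G$ are sufficiently nice (but not necessarily holomorphic) functions on $\mathbb{C}^n$. Then

\[ \int_{\mathbb{C}^n}\overline{F(z)}\,\frac{\partial G}{\partial z_j}(z)\,\mu(z)\,dz = -\int_{\mathbb{C}^n}\overline{\frac{\partial F}{\partial\bar z_j}(z)}\,G(z)\,\mu(z)\,dz - \int_{\mathbb{C}^n}\overline{\frac{\partial\log\mu}{\partial\bar z_j}(z)\,F(z)}\,G(z)\,\mu(z)\,dz, \tag{14.24} \]

where $dz$ denotes the $2n$-dimensional Lebesgue measure on $\mathbb{C}^n = \mathbb{R}^{2n}$.

Equation (14.24) tells us that

\[ \Big(\frac{\partial}{\partial z_j}\Big)^* = -\frac{\partial}{\partial\bar z_j} - \frac{\partial\log\mu}{\partial\bar z_j}, \]

where the adjoint is computed with respect to the inner product for the Hilbert space $L^2(\mathbb{C}^n,\mu)$. If we restrict the adjoint operator $(\partial/\partial z_j)^*$ to the space of holomorphic functions, then the $\partial/\partial\bar z_j$ term is zero, by the definition of a holomorphic function.

Proof. Let us approximate the integral over $\mathbb{C}^n$ on the left-hand side of (14.24) by an integral over a large cube. By performing either the $x_j$-integral or the $y_j$-integral first, we can integrate by parts to push the derivatives with respect to $x_j$ or $y_j$ off of $G$ and onto the product of $\bar F$ and $\mu$ (with a minus sign). The boundary term in the integration by parts will involve the function $\overline{F(z)}G(z)\mu(z)$ integrated over two opposite faces of the cube. If this function tends to zero sufficiently rapidly at infinity, the boundary terms will vanish in the limit. In that case, we obtain

\[ \int_{\mathbb{C}^n}\overline{F(z)}\,\frac{\partial G}{\partial z_j}\,\mu(z)\,dz = -\int_{\mathbb{C}^n}\frac{\partial\bar F}{\partial z_j}\,G(z)\,\mu(z)\,dz - \int_{\mathbb{C}^n}\overline{F(z)}\,G(z)\,\frac{\partial\mu}{\partial z_j}\,dz, \]

provided that all three of the above integrals are absolutely convergent. Since $\partial\bar F/\partial z_j = \overline{\partial F/\partial\bar z_j}$ and

\[ \frac{\partial\mu}{\partial z_j} = \frac{\partial\log\mu}{\partial z_j}\,\mu = \overline{\frac{\partial\log\mu}{\partial\bar z_j}}\,\mu, \]

we obtain (14.24). ■

We now look for a density $\mu_\hbar$ for which $\partial\log\mu_\hbar/\partial\bar z_j = -z_j/\hbar$. In that case, the adjoint operator $(\partial/\partial z_j)^*$ preserves the holomorphic subspace of $L^2(\mathbb{C}^n,\mu_\hbar)$ and is given on this subspace by multiplication by $z_j/\hbar$.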

Lemma 14.13 Specialize Lemma 14.12 to the case in which $F$ and $G$ are holomorphic polynomials and $\mu$ is the density given by

\[ \mu_\hbar(z) = (\pi\hbar)^{-n}\,e^{-|z|^2/\hbar}. \tag{14.25} \]

Then we have

\[ \int_{\mathbb{C}^n}\overline{F(z)}\,\frac{\partial G}{\partial z_j}\,\mu_\hbar(z)\,dz = \frac{1}{\hbar}\int_{\mathbb{C}^n}\overline{z_jF(z)}\,G(z)\,\mu_\hbar(z)\,dz. \tag{14.26} \]

Proof. In the case that $F$ and $G$ are holomorphic polynomials, $\partial F/\partial\bar z_j = 0$, so the first term on the right-hand side of (14.24) is zero. Furthermore, $\bar F G\mu_\hbar$ decreases rapidly at infinity and so the boundary terms vanish in this case. Finally, we may compute $\partial\log\mu_\hbar/\partial\bar z_j$ as $-z_j/\hbar$, giving (14.26). ■

Definition 14.14 The Segal-Bargmann space, denoted $\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$, is the space of holomorphic functions $F$ on $\mathbb{C}^n$ for which

\[ \|F\|_\hbar := \Big(\int_{\mathbb{C}^n}|F(z)|^2\,\mu_\hbar(z)\,dz\Big)^{1/2} < \infty, \]

where $\mu_\hbar$ is as in (14.25). Define raising and lowering operators $a_j^*$ and $a_j$ on $\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$ by

\[ a_j^* = z_j, \qquad a_j = \hbar\frac{\partial}{\partial z_j}, \]

with the domain of $a_j$ and $a_j^*$ consisting of the space of holomorphic polynomials.

In light of Lemma 14.13, the operators $a_j$ and $a_j^*$ satisfy

\[ \langle F, a_jG\rangle_{\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)} = \langle a_j^*F, G\rangle_{\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)} \]

for all holomorphic polynomials $F$ and $G$, thus justifying the notation $a_j^*$ for the raising operator. The space $\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$ is also sometimes called the Fock space. It should be noted, however, that in quantum field theory, the term Fock space also refers to a different (but related) space: the completion of the tensor algebra over a fixed Hilbert space.

Proposition 14.15 The Segal-Bargmann space is complete with respect to the norm $\|\cdot\|_\hbar$ and forms a Hilbert space with respect to the associated inner product,

\[ \langle F, G\rangle_\hbar := \int_{\mathbb{C}^n}\overline{F(z)}\,G(z)\,\mu_\hbar(z)\,dz. \]

Furthermore, the space of holomorphic polynomials forms a dense subspace of the Segal-Bargmann space.

Note that elements of $\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$ are actual functions on $\mathbb{C}^n$, not equivalence classes of functions. Nevertheless, we can regard $\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$ as a subspace of $L^2(\mathbb{C}^n,\mu_\hbar)$, since each equivalence class of almost-everywhere equal functions contains at most one holomorphic representative.

Proof. Given any $z_0\in\mathbb{C}^n$ and $R > 0$, let $P_{z_0,R}$ denote the polydisk given by

\[ P_{z_0,R} = \{z\in\mathbb{C}^n : |z_j - (z_0)_j| < R,\ j = 1,\dots,n\}. \]

Using a power-series argument, it is easy to show that the value of a holomorphic function $F$ at $z_0$ is equal to the average of $F$ over $P_{z_0,R}$. We can then multiply and divide by $\mu_\hbar$ to obtain

\[ F(z_0) = \frac{1}{(\pi R^2)^n}\int_{P_{z_0,R}}\frac{1}{\mu_\hbar(z)}\,F(z)\,\mu_\hbar(z)\,dz. \]

The Cauchy-Schwarz inequality then tells us that

\[ |F(z_0)| \le \frac{1}{(\pi R^2)^n}\,\sup_{z\in P_{z_0,R}}\Big[\frac{1}{\mu_\hbar(z)}\Big]\,\big\|\mathbf{1}_{P_{z_0,R}}\big\|_{L^2(\mathbb{C}^n,\mu_\hbar)}\,\|F\|_{L^2(\mathbb{C}^n,\mu_\hbar)}. \tag{14.27} \]

This inequality tells us that pointwise evaluation [the map $F\mapsto F(z_0)$] is a bounded linear functional on the Segal-Bargmann space.

Suppose now that $F_n$ is a sequence of holomorphic functions such that $F_n$ converges in $L^2(\mathbb{C}^n,\mu_\hbar)$ to some $F$. Using (14.27), we can easily show that $F_n$ converges to $F$ uniformly on compact sets, which implies that $F$ is also holomorphic. This shows that the holomorphic subspace of $L^2(\mathbb{C}^n,\mu_\hbar)$ is closed and hence is a Hilbert space.

To show the denseness of polynomials, consider some $F\in\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$ and let

\[ F(z) = \sum_{\mathbf{n}} a_{\mathbf{n}}\,z^{\mathbf{n}} \tag{14.28} \]

be the Taylor expansion of $F$, where $\mathbf{n}$ ranges over all multi-indices. This series converges to $F$ uniformly on compact subsets of $\mathbb{C}^n$. We claim that the terms in (14.28) are orthogonal. To see this, use Fubini's theorem to perform the integration of $z^{\mathbf{n}}$ against $z^{\mathbf{m}}$ one variable at a time. Using polar coordinates in each copy of $\mathbb{C}$, we can see that we will get zero if the power of $z_j$ in $z^{\mathbf{n}}$ is not the same as the power of $z_j$ in $z^{\mathbf{m}}$.

Since it is orthogonal, the series in (14.28) will converge in $L^2(\mathbb{C}^n,\mu_\hbar)$ provided that the sum of the squares of the norms of the terms is finite. If $P_{0,R}$ is a sequence of polydisks of increasing radius centered at the origin, the argument in the preceding paragraph shows that the terms in (14.28) are orthogonal in $L^2(P_{0,R},\mu_\hbar)$. Since the series converges uniformly on $P_{0,R}$, we can then interchange sum and integral to obtain

\[ \sum_{\mathbf{n}}|a_{\mathbf{n}}|^2\,\|z^{\mathbf{n}}\|^2_{L^2(P_{0,R},\mu_\hbar)} = \|F\|^2_{L^2(P_{0,R},\mu_\hbar)}. \]

By applying monotone convergence to both the sum over $\mathbf{n}$ and the integrals over $P_{0,R}$, we may let $R$ tend to infinity to obtain

\[ \sum_{\mathbf{n}}|a_{\mathbf{n}}|^2\,\|z^{\mathbf{n}}\|^2_{L^2(\mathbb{C}^n,\mu_\hbar)} = \|F\|^2_{L^2(\mathbb{C}^n,\mu_\hbar)} < \infty. \]

Thus, the series in (14.28) converges in $L^2(\mathbb{C}^n,\mu_\hbar)$ and this $L^2$ limit must coincide with the pointwise limit, namely $F$ itself. ■


14.4.2 The Exponentiated Commutation Relations

To apply the Stone-von Neumann theorem to the Segal-Bargmann space, we define self-adjoint "position" and "momentum" operators as follows:

\[ A_j = \frac{1}{\sqrt{2}}\Big(z_j + \hbar\frac{\partial}{\partial z_j}\Big), \qquad B_j = \frac{i}{\sqrt{2}}\Big(z_j - \hbar\frac{\partial}{\partial z_j}\Big). \]

We will identify one-parameter unitary groups having (extensions of) these operators as their infinitesimal generators, which will show (by Stone's theorem) that the generators are indeed self-adjoint on suitable domains. We will then verify the exponentiated commutation relations and check irreducibility.

Let us compute heuristically and then check that our results are correct. If we formally apply Theorem 14.1 to the (unbounded) operators $-\bar a_jz_j$ and $\hbar a_j\,\partial/\partial z_j$, we obtain

\[ \exp\Big\{\sum_j\Big(-\bar a_jz_j + \hbar a_j\frac{\partial}{\partial z_j}\Big)\Big\} = e^{-\hbar|a|^2/2}\,\exp\Big\{-\sum_j\bar a_jz_j\Big\}\,\exp\Big\{\hbar\sum_ja_j\frac{\partial}{\partial z_j}\Big\}. \tag{14.29} \]

This calculation suggests that we define operators $T_a$ by the formula

\[ (T_aF)(z) = e^{-\hbar|a|^2/2}\,e^{-\bar a\cdot z}\,F(z + \hbar a), \qquad a\in\mathbb{C}^n, \tag{14.30} \]

where for any $a, b\in\mathbb{C}^n$, we define $a\cdot b = \sum_j a_jb_j$ (no complex conjugates). Since the exponent on the left-hand side of (14.29) is skew-self-adjoint (the difference of an operator and its adjoint), we expect the operators $T_a$ to be unitary. For suitable choices of $a$, the operator on the left-hand side of (14.29) will become the one-parameter group generated by $A_j$ or $B_j$.

Theorem 14.16 For each $a\in\mathbb{C}^n$, the operator $T_a$ defined by (14.30) is a unitary operator on the Segal-Bargmann space, and the map $a\mapsto T_a$ is strongly continuous. These operators satisfy

\[ T_aT_b = e^{i\hbar\,\mathrm{Im}(\bar a\cdot b)}\,T_{a+b}. \tag{14.31} \]

In particular, for each $j$, the maps

\[ U_j(t) := T_{ite_j/\sqrt{2}}, \qquad V_j(t) := T_{te_j/\sqrt{2}} \]

are strongly continuous one-parameter unitary groups. The infinitesimal generators $A_j$ and $B_j$ of these groups satisfy the exponentiated commutation relations.

For any $F\in\mathrm{Dom}(A_j)$, we have

\[ (A_jF)(z) = \frac{1}{\sqrt{2}}\Big(z_jF(z) + \hbar\frac{\partial F}{\partial z_j}\Big), \]

and for any $F\in\mathrm{Dom}(B_j)$, we have

\[ (B_jF)(z) = \frac{i}{\sqrt{2}}\Big(z_jF(z) - \hbar\frac{\partial F}{\partial z_j}\Big). \]

Furthermore, the domains of $A_j$ and $B_j$ contain all holomorphic polynomials.

Finally, the operators $A_j$ and $B_j$ act irreducibly on the Segal-Bargmann space, in the sense of Definition 14.6.


Proof. It is evident that $(T_aF)(z)$ is holomorphic as a function of $z$ for each fixed $a$. Meanwhile, for any $F\in\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$, we have

\[
\begin{aligned}
\|T_aF\|^2_{L^2(\mathbb{C}^n,\mu_\hbar)}
&= (\pi\hbar)^{-n}\int_{\mathbb{C}^n} e^{-\hbar|a|^2}\,\big|e^{-\bar a\cdot z}\big|^2\,|F(z+\hbar a)|^2\,e^{-|z|^2/\hbar}\,dz \\
&= (\pi\hbar)^{-n}\int_{\mathbb{C}^n} e^{-|z+\hbar a|^2/\hbar}\,|F(z+\hbar a)|^2\,dz \\
&= \|F\|^2_{L^2(\mathbb{C}^n,\mu_\hbar)},
\end{aligned}
\]

showing that $T_a$ is isometric. The formula for $T_aT_b$ follows from direct computation (Exercise 7), and from this formula we see that $T_aT_{-a} = I$, which shows that $T_a$ is surjective and thus unitary. The strong continuity of $T_a$ is easily verified on polynomials (Exercise 8), which are dense in $\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$.
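The composition law (14.31) is left to Exercise 7, but it can be spot-checked numerically. The following Python sketch handles n = 1 only and applies the operators to a sample polynomial. The names `T`, `F`, and `composition_defect`, together with the value of `HBAR`, are assumptions chosen for illustration, and the check is no substitute for the direct computation.

```python
import cmath

HBAR = 0.6  # arbitrary positive value, chosen only for illustration


def T(a, F):
    """Return T_a F, where (T_a F)(z) = e^{-hbar|a|^2/2} e^{-conj(a) z} F(z + hbar a)."""
    def TaF(z):
        return (cmath.exp(-HBAR * abs(a) ** 2 / 2)
                * cmath.exp(-a.conjugate() * z)
                * F(z + HBAR * a))
    return TaF


def F(z):
    """A sample holomorphic polynomial."""
    return 1 + 2 * z - 0.5j * z ** 3


def composition_defect(a, b, z):
    """|(T_a T_b F)(z) - e^{i hbar Im(conj(a) b)} (T_{a+b} F)(z)|."""
    lhs = T(a, T(b, F))(z)
    phase = cmath.exp(1j * HBAR * (a.conjugate() * b).imag)
    rhs = phase * T(a + b, F)(z)
    return abs(lhs - rhs)
```

Taking b = -a makes the phase trivial and gives $T_aT_{-a} = I$, which is the step used above to conclude that $T_a$ is surjective.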

It easily follows from (14.31) that $U_j(\cdot)$ and $V_j(\cdot)$ are one-parameter unitary groups, and also that (the infinitesimal generators of) these unitary groups satisfy the exponentiated commutation relations. If $F$ is in the domain of the infinitesimal generator of $U_j(\cdot)$, the limit

\[ (A_jF)(z) := \frac{1}{i}\lim_{t\to 0}\frac{1}{t}\Big[e^{-\hbar t^2/4}\,e^{itz_j/\sqrt{2}}\,F\big(z + it\hbar e_j/\sqrt{2}\big) - F(z)\Big] \tag{14.32} \]





must exist in $L^2(\mathbb{C}^n,\mu_\hbar)$. The $L^2$ limit coincides with the easily computed pointwise limit, giving

\[ (A_jF)(z) = \frac{1}{\sqrt{2}}\Big(z_jF(z) + \hbar\frac{\partial F}{\partial z_j}\Big), \]

as claimed. If $F$ is a polynomial, it is easily shown, using dominated convergence, that the limit in (14.32) exists in $L^2(\mathbb{C}^n,\mu_\hbar)$. The analysis of $B_j$ is similar.

Finally, we address irreducibility. If the $A_j$'s and $B_j$'s did not act irreducibly, then in the application of the Stone-von Neumann theorem to $\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$, there would exist at least two subspaces $V_l$. Thus, there would exist at least two linearly independent vectors $F_l$ such that for all $j$, we have that $F_l$ is in the domain of $A_j$ and $B_j$ and

\[ (A_j + iB_j)F_l = 0. \]

(Take $F_l$ to be the preimage under $U_l$ of the function $\phi_0$ in (14.12), with $\sigma = \hbar$.) Since $A_j + iB_j = \sqrt{2}\,\hbar\,\partial/\partial z_j$, this would mean that each $F_l$ is constant, contradicting the assumption that the $F_l$'s are linearly independent. ■

14.4.3 The Reproducing Kernel

According to (14.27), evaluation of $F\in\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$ at a fixed point $z$ is a continuous linear functional. Thus, this linear functional can be written as the inner product with a unique element $\chi_z$ of $\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$, which we now compute. The vector $\chi_z$ is called the coherent state with parameter $z$.

Proposition 14.17 For all $F\in\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$, we have

\[ F(z) = \int_{\mathbb{C}^n} e^{z\cdot\bar w/\hbar}\,F(w)\,\mu_\hbar(w)\,dw. \tag{14.33} \]

The function $e^{z\cdot\bar w/\hbar}$ is called the reproducing kernel for $\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$, since integration against this kernel simply gives back (or "reproduces") the function $F$. Of course, the relation (14.33) holds only for holomorphic functions in $L^2(\mathbb{C}^n,\mu_\hbar)$. Equation (14.33) can be rewritten as

\[ F(z) = \langle\chi_z, F\rangle_{\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)}, \]

where

\[ \chi_z(w) = e^{\bar z\cdot w/\hbar}. \]

Proof. We begin by establishing the result in the case $z = 0$. We have already established, in the proof of Proposition 14.15, that the Taylor series of $F$ converges to $F$ in $\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$, and the distinct monomials in this series are orthogonal. Thus, when computing $\langle 1, F\rangle_{\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)}$, only the constant term in the expansion of $F$ survives, giving

\[ \langle 1, F\rangle_{\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)} = F(0)\,\langle 1, 1\rangle_{\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)} = F(0), \tag{14.34} \]

since $\mu_\hbar$ is a probability measure. But this relation is precisely the $z = 0$ case of (14.33).

Let us now apply (14.34) to $T_aF$, where $T_a$ is the unitary operator in (14.30). According to Theorem 14.16, $T_a$ is unitary with inverse equal to $T_{-a}$, giving

\[ (T_aF)(0) = \langle 1, T_aF\rangle_{\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)} = \langle T_{-a}1, F\rangle_{\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)}. \]

Writing this relation out using $w$ as our variable of integration gives

\[ e^{-\hbar|a|^2/2}\,F(\hbar a) = \int_{\mathbb{C}^n} e^{-\hbar|a|^2/2}\,e^{a\cdot\bar w}\,F(w)\,\mu_\hbar(w)\,dw. \]

Setting $a = z/\hbar$ and simplifying gives the desired result. ■
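The reproducing property can be illustrated numerically for n = 1. The sketch below approximates the integral against the kernel $e^{z\bar w/\hbar}$ by a two-dimensional Riemann sum over a square in the $w$-plane. The function name `reproduce`, the grid parameters, and the test polynomial are assumptions made for this illustration; the default grid is tuned for $\hbar$ near 1 and would need a finer step for small $\hbar$.

```python
import cmath
import math


def reproduce(F, z, hbar=1.0, half_width=8.0, step=0.1):
    """Riemann-sum approximation of the integral of e^{z conj(w)/hbar} F(w) mu_hbar(w) dw."""
    n_steps = int(round(2 * half_width / step))
    total = 0j
    for j in range(n_steps + 1):
        u = -half_width + j * step
        for k in range(n_steps + 1):
            v = -half_width + k * step
            w = complex(u, v)
            # Gaussian density mu_hbar(w) = (pi hbar)^{-1} e^{-|w|^2/hbar} for n = 1.
            density = math.exp(-(u * u + v * v) / hbar) / (math.pi * hbar)
            total += cmath.exp(z * w.conjugate() / hbar) * F(w) * density
    return total * step * step


def sample_polynomial(w):
    """A holomorphic test function."""
    return (1 + w) ** 2
```

For a holomorphic polynomial the returned value should match the value of the polynomial at z, up to quadrature error. Replacing the test function by a non-holomorphic one, such as the complex conjugate of w, would break the agreement, in line with the remark that (14.33) holds only for holomorphic functions.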

14-4-4 The Segal-Bargmann Transform 

Since the operators Aj and Bj in Theorem 14.16 satisfy the exponentiated 
commutation relations and act irreducibly on TLL 2 ( y C n , gn), the second part 
of the Stone-von Neumann theorem tells us that there is a unitary map 
U : 'HL 2 ( C n , gn) —> T 2 (R"), unique up to a constant, that intertwines these 
operator with the usual position and momentum operators. The inverse 
map V : L 2 (K n ) —> HL 2 (C n , gn) is called the Segal-Bargmann transform. 

Theorem 14.18 Let V be the inverse of the map U : 'HL 2 (C n , gn) —> 
L 2 (K n ) given by the Stone -von Neumann theorem, normalized so that V 
takes the function </>o £ L 2 (R n ) in (14-12) (with a = h) to the constant 
function 1 £ 'HL 2 {C n , gT). Then V may be computed as follows: 

(V , 0)(z) = ( nh)~ n B J exp | — — ^z • z — 2-\/2z • x + x • | V’( x ) ^ x - 

Recall that we define a ■ b = ffj f° r a h a, b £ C n , with no complex 
conjugates in the definition. In particular, the integrand in the formula for 
Vtp is a holomorphic function of z, for each fixed x. 

Note that the value of (T7)( z ) at z = 0 is simply the inner product of if 
with the ground state function (f> o, with cr = h. The proof of Theorem 14.18 
will show that the value of (Vif)(z) at an arbitrary z is a certain constant 
c z times the inner product of if with a phase space translate of (fo, that is, 
a vector of the form e la ' x e lb P ()>o. [See (14.36).] According to (the obvious 
higher-dimensional counterpart to) Proposition 12.11, <fio is a minimum un¬ 
certainty state, meaning that equality is achieved in Corollary 12.9 for each 
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j. Thus, by (the obvious higher-dimensional counterpart to) Exercise 3 in 
Chap. 12, each state of the form e * aX e lbp 0 o is also a minimum uncertainty 
state. 

Proof. By the unitarity of V and the z = 0 case of Proposition 14.17, we 
have 

(^0 5 V’)l 2 (R") = {y^,VlP) H L 2 (C n^ h) = (1) = (^VO(O)- 

Thus, the value of Vip at 0 is just the inner product of with (j> 0 . More 
generally, 

(e-“'X e - ib -P^^> = <^ 0 ,e ibP e- x ^) 

= (V</> 0 ,Pe lbp e mX ^) 

= (l,e lbB e laA P^) 

= (e ibB e la ' A P^)(0), (14.35) 

where e la ' A means the product (in any order) of the operators e lajAj , and 
similarly for e lb B . 

Recall that the $A_j$'s and $B_j$'s are defined as the infinitesimal generators
of the groups $U_j$ and $V_j$ in Theorem 14.16, which in turn are defined in
terms of the operators $T_a$. If we use (14.31) to compute the right-hand side
of (14.35), we obtain

\begin{align*}
(e^{ib\cdot B} e^{ia\cdot A} V\psi)(0)
&= (T_{\hbar b/\sqrt{2}}\, T_{i\hbar a/\sqrt{2}}\, V\psi)(0) \\
&= e^{i\hbar a\cdot b/2}\, (T_{\hbar(b+ia)/\sqrt{2}}\, V\psi)(0) \\
&= e^{i\hbar a\cdot b/2}\, e^{-\hbar(|a|^2+|b|^2)/4}\, (V\psi)\big(\hbar(b+ia)/\sqrt{2}\big).
\end{align*}

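Thus, if we apply (14.35) with $a = \sqrt{2}\,y_0/\hbar$ and $b = \sqrt{2}\,x_0/\hbar$, we obtain

\[
\langle e^{-i\sqrt{2}\,y_0\cdot X/\hbar}\, e^{-i\sqrt{2}\,x_0\cdot P/\hbar}\phi_0, \psi \rangle
= e^{i x_0\cdot y_0/\hbar}\, e^{-(|x_0|^2+|y_0|^2)/(2\hbar)}\, (V\psi)(x_0 + iy_0). \tag{14.36}
\]

Solving (14.36) for $(V\psi)(x_0 + iy_0)$ gives

\[
(V\psi)(x_0 + iy_0) = (\pi\hbar)^{-n/4}\, e^{-i x_0\cdot y_0/\hbar}\, e^{(|x_0|^2+|y_0|^2)/(2\hbar)}
\int_{\mathbb{R}^n} e^{i\sqrt{2}\,y_0\cdot x/\hbar}\, e^{-|x-\sqrt{2}\,x_0|^2/(2\hbar)}\, \psi(x)\, dx,
\]

which simplifies to the claimed formula for $V\psi$. ■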
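
As a numerical sanity check of the normalization in Theorem 14.18, the following sketch evaluates the integral formula for $V$ in the case $n = 1$ and confirms that the ground state is sent to the constant function 1. The value of `HBAR`, the integration window, and the function names are illustrative assumptions and are not taken from the text; the quadrature is a plain trapezoid rule.

```python
import cmath
import math

HBAR = 0.7  # illustrative value of hbar (an assumption; any positive value works)


def ground_state(x, hbar=HBAR):
    """phi_0(x) = (pi*hbar)^(-1/4) exp(-x^2/(2*hbar)), the n = 1 ground state."""
    return (math.pi * hbar) ** -0.25 * math.exp(-x * x / (2.0 * hbar))


def segal_bargmann(psi, z, hbar=HBAR, half_width=12.0, steps=6000):
    """Approximate (V psi)(z) for n = 1 by the trapezoid rule on [-half_width, half_width]."""
    dx = 2.0 * half_width / steps
    total = 0.0 + 0.0j
    for k in range(steps + 1):
        x = -half_width + k * dx
        weight = 0.5 if k in (0, steps) else 1.0
        # Kernel exp{-(z*z - 2*sqrt(2)*z*x + x*x) / (2*hbar)} from Theorem 14.18.
        kernel = cmath.exp(-(z * z - 2.0 * math.sqrt(2.0) * z * x + x * x) / (2.0 * hbar))
        total += weight * kernel * psi(x)
    return (math.pi * hbar) ** -0.25 * total * dx


if __name__ == "__main__":
    # Should be close to 1 + 0j at any complex point z.
    print(segal_bargmann(ground_state, 0.8 - 0.5j))
```

The same routine can be applied to any rapidly decaying `psi`; the window `half_width` would need to grow with the imaginary part of `z` and with the spread of `psi`.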


14.5 Exercises 

1. Show that if operators $A$ and $B$ satisfy the exponentiated commutation
relations of Sect. 14.2, they satisfy the “semi-exponentiated”
commutation relations, that is, the hypotheses of Theorem 12.8.
Hint: For any $a, s \in \mathbb{R}$ and $\psi \in \mathrm{Dom}(A)$, rearrange the expression

\[
\frac{e^{isA}(e^{iaB}\psi) - (e^{iaB}\psi)}{s}
\]

using the exponentiated commutation relations. Then let $s$ tend to
zero and apply Stone’s theorem.


2. (a) Suppose $a : \mathbb{R} \to \mathcal{B}(\mathbf{H})$ is a differentiable map, meaning that

\[
\lim_{h \to 0} \frac{a(t + h) - a(t)}{h}
\]

exists in the norm topology of $\mathcal{B}(\mathbf{H})$ for each $t$. Show that if
$da/dt = 0$ for all $t$, then $a$ is constant.

(b) Suppose $a : \mathbb{R} \to \mathcal{B}(\mathbf{H})$ is a differentiable map such that

\[
\frac{da}{dt} = a(t)A
\]

for some fixed $A \in \mathcal{B}(\mathbf{H})$. Show that $a(t) = a(0)e^{tA}$ for all $t$.


3. Show that the operators $A_j := P_j$ and $B_j := -X_j$ on $L^2(\mathbb{R}^n)$ satisfy
the exponentiated commutation relations. Determine the unitary
operator $U : L^2(\mathbb{R}^n) \to L^2(\mathbb{R}^n)$ (unique up to a constant) such that

\begin{align*}
U e^{itA_j} U^{-1} &= e^{itX_j} \\
U e^{itB_j} U^{-1} &= e^{itP_j}.
\end{align*}


4. Verify that the operators U a ^{t) in (14.9) form a strongly continuous 
one-parameter unitary group. 

5. In this exercise, we develop a discrete version of (the $n = 1$ case of)
the Stone-von Neumann theorem. Let $p$ be a prime number, let $\mathbb{Z}/p$
denote the field of integers modulo $p$, and let $\hbar$ be a nonzero element
of $\mathbb{Z}/p$. Consider the finite-dimensional Hilbert space $L^2(\mathbb{Z}/p)$,
taken with respect to the counting measure on $\mathbb{Z}/p$. Let $U$ denote the
“modulation” operator

\[
(Uf)(n) = e^{2\pi i n/p} f(n)
\]

and let $V$ denote the “translation” operator on $L^2(\mathbb{Z}/p)$, given by

\[
(Vf)(n) = f(n + \hbar).
\]

In the case of the modulation operator, note that the expression
$e^{2\pi i n/p}$ descends unambiguously from $n \in \mathbb{Z}$ to $n \in \mathbb{Z}/p$.

(a) Verify that $U^p = V^p = I$ and that, for all $l$ and $m$ in $\mathbb{Z}$,

\[
U^l V^m = e^{-2\pi i l m \hbar/p}\, V^m U^l.
\]

(b) Suppose now that $A$ and $B$ are unitary operators on a
finite-dimensional Hilbert space $\mathbf{H}$ satisfying $A^p = B^p = I$ and

\[
A^l B^m = e^{-2\pi i l m \hbar/p}\, B^m A^l.
\]

Suppose also that the only subspaces of $\mathbf{H}$ invariant under both
$A$ and $B$ are $\{0\}$ and $\mathbf{H}$. Show that there is a unitary map $W$
from $\mathbf{H}$ to $L^2(\mathbb{Z}/p)$ such that

\begin{align*}
WAW^{-1} &= U \\
WBW^{-1} &= V.
\end{align*}

Hint: Show that if $v \in \mathbf{H}$ is an eigenvector for $A$, then so is
$B^l v$ for any $l$. Show that each eigenspace for $A$ has dimension 1
and identify the associated eigenvectors with the “$\delta$-functions”
in $L^2(\mathbb{Z}/p)$.
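
Before attempting part (a) by hand, it can help to see the relations hold numerically for a small prime. The sketch below builds the matrices of the modulation and translation operators on $L^2(\mathbb{Z}/p)$ and measures the defect in the commutation relation of part (a) for non-negative exponents. All function names, and the choice of $p$ and of the shift `h` standing in for $\hbar$, are illustrative assumptions; this is a check, not a proof.

```python
import cmath


def modulation_matrix(p):
    """Matrix of U on L^2(Z/p): (U f)(n) = exp(2*pi*i*n/p) f(n)."""
    return [[cmath.exp(2j * cmath.pi * n / p) if n == k else 0j for k in range(p)]
            for n in range(p)]


def translation_matrix(p, h):
    """Matrix of V on L^2(Z/p): (V f)(n) = f(n + h), with indices taken mod p."""
    return [[1 + 0j if k == (n + h) % p else 0j for k in range(p)] for n in range(p)]


def matmul(a, b):
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def matpow(a, e):
    """a raised to a non-negative integer power e (e = 0 gives the identity)."""
    size = len(a)
    result = [[1 + 0j if i == j else 0j for j in range(size)] for i in range(size)]
    for _ in range(e):
        result = matmul(result, a)
    return result


def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(len(a)) for j in range(len(a)))


def weyl_defect(p, h, l, m):
    """Largest entry of U^l V^m - exp(-2*pi*i*l*m*h/p) V^m U^l."""
    u, v = modulation_matrix(p), translation_matrix(p, h)
    left = matmul(matpow(u, l), matpow(v, m))
    phase = cmath.exp(-2j * cmath.pi * l * m * h / p)
    right = [[phase * entry for entry in row]
             for row in matmul(matpow(v, m), matpow(u, l))]
    return max_diff(left, right)


if __name__ == "__main__":
    print(weyl_defect(5, 2, 3, 4))  # of the order of rounding error
```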

6. Given a constant $u \in \mathbb{C}$ with $|u| = 1$ and a pair of vectors $a, b \in \mathbb{R}^n$,
let $U_{u,a,b}$ be the unitary operator on $L^2(\mathbb{R}^n)$ given by

\[
(U_{u,a,b}\psi)(x) = u\, e^{ia\cdot x}\, \psi(x + \hbar b).
\]

(a) Verify that the set of operators of this form is a group under the
operation of composition, and denote this group by $H_n$.

(b) Let $\tilde{H}_n$ denote the set of $(n + 2) \times (n + 2)$ matrices of the form

\[
A = \begin{pmatrix}
1 & a_1 & \cdots & a_n & c \\
  & 1   &        &     & b_1 \\
  &     & \ddots &     & \vdots \\
  &     &        & 1   & b_n \\
  &     &        &     & 1
\end{pmatrix},
\]

with $a_1, \ldots, a_n$, $b_1, \ldots, b_n$, and $c$ in $\mathbb{R}$. (The only nonzero entries
in $A$ are on the main diagonal, in the first row, and in the last
column.) Verify that $\tilde{H}_n$ forms a group under matrix multiplication.
Show that there is a surjective group homomorphism
$\Phi : \tilde{H}_n \to H_n$ with discrete kernel.

Hint: Compare the formulas for group multiplication in $\tilde{H}_n$
and $H_n$.

Note: In the language of Chap. 16, $\tilde{H}_n$ is the universal covering group
of $H_n$. The group $\tilde{H}_n$ is called the Heisenberg group.

7. Show by direct computation that the operators $T_a$ in (14.30) satisfy
the relations (14.31).

8. Using dominated convergence, show that for every holomorphic
polynomial $F$ on $\mathbb{C}^n$, we have

where T a is as in (14.30). 


15 

The WKB Approximation 


15.1 Introduction 

The WKB method, named for Gregor Wentzel, Hendrik Kramers, and Leon
Brillouin, gives an approximation to the eigenfunctions and eigenvalues of
the Hamiltonian operator $H$ in one dimension. The approximation is best
understood as applying to a fixed range of energies as $\hbar$ tends to zero. (It
is also reasonable in many cases to think of the approximation as applying
to a fixed value of $\hbar$ as the energy tends to infinity.)

The idea of the WKB approximation is that the potential function $V(x)$
can be thought of as being “slowly varying,” with the result that solutions
to the time-independent Schrödinger equation will look locally like the
solutions in the case of a constant potential. In the classically allowed region,
this line of thinking will yield an approximation consisting of a rapidly
oscillating complex exponential multiplied by a slowly varying amplitude. We
make the “local frequency” of the exponential equal to what it would be if
$V$ were constant. Having made this choice, there is a unique choice for the
amplitude that yields an error that is of order $\hbar^2$. This amplitude, however,
tends to infinity as we approach the “turning points,” that is, the points
where the classical particle changes directions. Similarly, in the classically
forbidden region, we obtain approximate solutions that are rapidly growing
or decaying exponentials, multiplied by a slowly varying factor. Again,
there is a unique choice for the slowly varying factor that gives errors of
order $\hbar^2$, and again, this factor blows up at the turning points.

The difficulty near the turning points means that we cannot directly
“match” the approximate solutions in different regimes the way we did in
Chap. 5. Instead, we will use the Airy function to approximate the solution
to the Schrödinger equation near the turning points. Asymptotics of the
Airy function will then yield the appropriate matching condition, which
turns out to be a corrected form of the Bohr-Sommerfeld rule that appears
in the “old” quantum theory.

15.2 The Old Quantum Theory and the 
Bohr-Sommerfeld Condition 

The old quantum theory, developed by Bohr, Sommerfeld, and de Broglie,
among others, may be pictured as follows. Consider, for simplicity, a
particle with one degree of freedom, and let $C$ be a level set in phase space of
the Hamiltonian,

\[
C = \left\{ (x,p) \in \mathbb{R}^2 \,\middle|\, H(x,p) = E \right\}, \tag{15.1}
\]

which we assume to be a closed curve. We now imagine drawing a “wave”
on $C$, that is, some oscillatory function defined over $C$. Following the de
Broglie hypothesis (Sect. 1.2.2), we postulate that the local frequency $k$ of
the wave as a function of $x$ is $p/\hbar$. This means that the phase of our wave
should be obtained by integrating the 1-form

\[
\frac{1}{\hbar}\, p\, dx \tag{15.2}
\]

along the curve. Thus, the wave itself can be pictured as a function on $C$
of the form

\[
\cos\left( \frac{1}{\hbar} \int_{x_0}^{x} p\, dx - \delta \right), \tag{15.3}
\]

where $x_0$ is some arbitrary starting point on the curve $C$ and where $\delta$ is an
arbitrary phase. Note that the old quantum theory did not offer a physical
interpretation of this wave; it was simply a crude attempt to introduce
waves into the picture.

The Bohr-Sommerfeld condition is simply the requirement that the function
in (15.3) should match up with itself when we go all the way around
the curve. This will happen precisely if

\[
\frac{1}{\hbar} \oint_C p\, dx = 2\pi n, \tag{15.4}
\]

for some integer $n$. The energy levels in the old quantum theory were taken
to be those numbers $E$ for which the corresponding level curve $C$ satisfies
the Bohr-Sommerfeld condition (15.4). Although Bohr-Sommerfeld
quantization had some successes, notably explaining the energy levels of
the hydrogen atom, it ultimately failed to correctly predict the energies of
complex systems.

For systems with one degree of freedom, a vestige of the Bohr-Sommerfeld
approach survives in modern quantum theory, with two modifications.
First, the condition (15.4) has to be corrected by replacing the $n$ by $n + 1/2$
on the right-hand side of (15.4). (The replacement of $n$ by $n + 1/2$ is known
as the Maslov correction.) Second, this condition does not (in most cases)
give the exact energy levels, but only the leading-order semiclassical
approximation to the energy levels. The preceding discussion leads to the
following definition.

Condition 15.1 A number $E$ is said to satisfy the Maslov-corrected
Bohr-Sommerfeld condition if

\[
\frac{1}{\hbar} \oint_C p\, dx = 2\pi \left( n + \frac{1}{2} \right) \tag{15.5}
\]

for some integer $n$, where $C$ is the classical energy curve in (15.1). In light
of Green’s theorem, this condition may be rewritten as

\[
\frac{1}{2\pi\hbar}\, (\text{Area enclosed by } C) = n + \frac{1}{2}.
\]
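
To see Condition 15.1 at work in a case where the answer is known, the following sketch solves the condition numerically for the harmonic oscillator $H = p^2/(2m) + m\omega^2 x^2/2$. The level curve is then an ellipse of area $2\pi E/\omega$, so the condition should return $E = \hbar\omega(n + 1/2)$. The choice of potential, the function names, and the parameter values are assumptions made for illustration only.

```python
import math


def action_integral(energy, mass=1.0, omega=1.0, steps=400):
    """Integral of p dx once around the curve H = energy for the harmonic oscillator.

    This is twice the integral of p(x) = sqrt(2m(E - V(x))) between the turning
    points. The substitution x = r*sin(theta) removes the square-root behaviour
    of the integrand at the turning points x = -r and x = r.
    """
    r = math.sqrt(2.0 * energy / (mass * omega ** 2))
    dtheta = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = -0.5 * math.pi + (k + 0.5) * dtheta  # midpoint rule in theta
        x = r * math.sin(theta)
        potential = 0.5 * mass * omega ** 2 * x * x
        p = math.sqrt(max(0.0, 2.0 * mass * (energy - potential)))
        total += p * r * math.cos(theta) * dtheta
    return 2.0 * total


def bohr_sommerfeld_energy(n, hbar=1.0, mass=1.0, omega=1.0):
    """Solve (1/hbar) * action_integral(E) = 2*pi*(n + 1/2) for E by bisection."""
    target = 2.0 * math.pi * hbar * (n + 0.5)
    lo, hi = 1e-12, 1.0
    while action_integral(hi, mass, omega) < target:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if action_integral(mid, mass, omega) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for n in range(4):
        print(n, bohr_sommerfeld_energy(n, hbar=0.5, omega=2.0))
```

For the oscillator the corrected condition happens to reproduce the exact quantum levels; for a general potential it gives only the leading-order semiclassical approximation, as noted above.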


When the Maslov correction is included, the Bohr-Sommerfeld condition 
can be stated as saying that the wave with phase given by integrating the 
1-form in (15.2) should be 180° out of phase with itself after one trip around 
the energy curve. Figure 15.1 shows an example, which should be contrasted 
with Fig. 1.3. (Note also that Fig. 1.3 is drawn in the configuration space, 
whereas Fig. 15.1 is in the phase space.) 

In our analysis in the subsequent sections, we will see that the Maslov
correction (that is, the extra 1/2 in (15.5), as compared to (15.4)) actually
consists of a contribution of 1/4 from each of the two “turning points” of
the classical particle. (The turning points are the points where the classical
particle changes directions.) Specifically, in the WKB approximation, the
phase of the wave function will be computed as the integral of $(p\, dx)/\hbar$
along one “branch” of the classical energy curve $C$. Using the Airy function
to approximate the wave function near the turning points, we will obtain
an “extra” $\pi/4$ of phase between each turning point and the last local
maximum or minimum of the wave function. Because of the two branches
of $C$, the extra $\pi/4$ of phase near each of the two turning points actually
contributes an extra $\pi$ to the integral on the left-hand side of (15.5).

FIGURE 15.1. A trajectory satisfying the corrected Bohr-Sommerfeld condition
with n = 10.

The reader may wonder why there is no comparable correction term
in our discussion of the Bohr-de Broglie model of the hydrogen atom in
Sect. 1.2.2. One way to answer this question is as follows. As we will see in
Sect. 18.1, the Schrödinger operator for the hydrogen atom can be reduced
to a one-dimensional Schrödinger operator with an effective potential of the
form

\[
V_{\mathrm{eff}}(r) = -\frac{Q^2}{r} + \frac{\hbar^2\, l(l + 1)}{2mr^2}.
\]

Here $l$ is a non-negative integer that labels the “total angular momentum”
of the wave function. At least when $l > 0$, one can analyze this Schrödinger
operator using a WKB-type analysis very similar to the one in the current
chapter, with one important modification: The radial wave function [the
quantity $h(r)$ in (18.5)] must be zero at $r = 0$ in order for the wave function
to be in the domain of the Hamiltonian.

If one analyzes the situation carefully, it turns out that the zero boundary
condition at $r = 0$ introduces another correction into the Bohr-Sommerfeld
condition in the amount of 1/2. There is still also a correction of 1/4 for
each of the two turning points, leading to the condition

\[
\frac{1}{\hbar} \oint_C p\, dx = 2\pi \left( n + \frac{1}{4} + \frac{1}{4} + \frac{1}{2} \right) = 2\pi (n + 1).
\]

Since $n + 1$ is again an integer, we are effectively back to the uncorrected
Bohr-Sommerfeld condition. See Chap. 11 of [8] for a discussion of different
approaches to the WKB approximation for radial potentials.


15.3 Classical and Semiclassical Approximations 

We are interested in finding approximate solutions to the time-independent
Schrödinger equation,

\[
-\frac{\hbar^2}{2m}\psi''(x) + (V(x) - E)\psi(x) = 0 \tag{15.6}
\]

for small values of $\hbar$. Ultimately, we will need to analyze the behavior of
solutions in three different regions, the classically allowed region [points
where $V(x) < E$], the classically forbidden region [points where $V(x) > E$],
and the region near the “turning points,” that is, the points where
$V(x) = E$.

Let us consider at first the classically allowed region. Given a potential
$V$ and an energy level $E$, we can solve (up to a choice of sign) for the
momentum of a classical particle as a function of position as

\[
p(x) = \sqrt{2m(E - V(x))}.
\]

We look for approximate solutions $\psi$ to (15.6) of the form

\[
\psi(x) = A(x) e^{\pm iS(x)/\hbar}, \tag{15.7}
\]

where $S$ satisfies $S'(x) = p(x)$. Note that we are taking the phase of our
wave function to be

\[
\text{phase} = \pm\frac{1}{\hbar} \int p(x)\, dx,
\]

as in the old quantum theory in Sect. 15.2. The “amplitude function” $A(x)$
will be chosen to be independent of $\hbar$ and thus “slowly varying” (for small $\hbar$)
compared to the exponent $S(x)/\hbar$.

Our first, elementary, result is that for any number $E$ for which there is
a classically allowed region and for any reasonable choice of the amplitude
$A(x)$ in (15.7), we obtain an approximate eigenvector solution to the
time-independent Schrödinger equation, with an error term of order $\hbar$.

Proposition 15.2 For any two numbers $E_1$ and $E_2$ with $E_1 > \inf_{x \in \mathbb{R}} V(x)$,
there exists a constant $C$ and a nonzero function $A \in C_c^\infty(\mathbb{R})$ with the
following property. For every $E \in [E_1, E_2]$, the support of $A$ is contained
in the classically allowed region at energy $E$ and the function $\psi$ given by

\[
\psi(x) = A(x) \exp\left\{ \frac{i}{\hbar} \int p(x)\, dx \right\}
\]

satisfies

\[
\| H\psi - E\psi \| \le C\hbar \|\psi\|. \tag{15.8}
\]

Proof. For any $E \in [E_1, E_2]$, the classically allowed region for energy $E$
contains the classically allowed region for energy $E_1$. We choose, then, $A$ to
be any nonzero element of $C_c^\infty(\mathbb{R})$ with support in the classically allowed
region for energy $E_1$. If we evaluate $H\psi - E\psi$ by direct calculation, there
will be a term in which two derivatives fall on the exponential factor, bringing
down a factor involving $p(x)^2$. The definition of $p(x)$ is such that the term
involving $p(x)^2$ will cancel the term involving $V(x) - E$, leaving us with

\[
H\psi - E\psi = -\frac{\hbar^2}{2m} \left( A''(x) \pm \frac{2i}{\hbar} A'(x)p(x) \pm \frac{i}{\hbar} p'(x)A(x) \right)
\exp\left\{ \pm\frac{i}{\hbar} \int p(x)\, dx \right\}. \tag{15.9}
\]

(Here, each occurrence of the symbol $\pm$ has the same value, either all pluses
or all minuses.) Thus,

\[
\| H\psi - E\psi \| \le \frac{\hbar^2}{2m} \|A''\| + \frac{\hbar}{2m} \|2A'p + Ap'\|. \tag{15.10}
\]

Since $\|\psi\|$ is independent of $\hbar$, the right-hand side of (15.10) is of order
$\hbar \|\psi\|$. It is easy to check that $\|2A'p + Ap'\|$ is bounded as a function of $E$
for any $E$ in the range $[E_1, E_2]$ and the result follows. ■

Proposition 15.2, along with elementary spectral theory, tells us that for
any $E$ larger than the minimum of $V$, there is a point $\tilde{E}$ in the spectrum
of $H$ such that

\[
|\tilde{E} - E| \le C\hbar. \tag{15.11}
\]

(See Exercise 4 in Chap. 10.) If we assume that $V(x)$ tends to $+\infty$ as
$x \to \pm\infty$, then $H$ will have discrete spectrum and we can say that $\tilde{E}$ is
an eigenvalue for $H$. The conclusion, for such potentials, is this: Given any
number $E \in [E_1, E_2]$, there is an eigenvalue of $H$ within $C\hbar$ of $E$. Thus, as
$\hbar$ tends to zero, the eigenvalues of $H$ “fill up” the entire range of values of
the classical energy function.

Proposition 15.2 is one manifestation of the “classical limit” of quantum
mechanics: the quantum energy spectrum is, in a certain sense, approximating
the classical energy spectrum as $\hbar$ gets small. Notice, however, that
this result tells us only that the eigenvalues are at most order $\hbar$ apart and
nothing further about the location of the individual eigenvalues.

In this chapter, we will show that if $E$ satisfies the corrected
Bohr-Sommerfeld condition, then there exists an eigenvalue $\tilde{E}$ of $H$ such that

\[
|\tilde{E} - E| \le C\hbar^{9/8}. \tag{15.12}
\]

An estimate of the form (15.12) locates eigenvalues with an error bound
that is small compared to the expected average spacing between the
eigenvalues, which is of order $\hbar$. On the other hand, the approximate energy
levels $E$ are determined by Condition 15.1, which is a condition on the
classical energy curve. Thus, (15.12) can be described as a semiclassical
estimate: It is estimating quantum mechanical quantities (the individual
energy levels) in classical terms (the level curves of the classical
Hamiltonian).

15.4 The WKB Approximation Away
from the Turning Points

We consider only the simplest interesting case of the WKB approximation,
in which the following assumption holds. See the book of Miller [30] for
much more about this sort of asymptotic analysis.

Assumption 15.3 Consider a smooth, real-valued potential $V(x)$, with
$V(x) \to +\infty$ as $x \to \pm\infty$. Assume that the functions $V'(x)/V(x)$ and
$V''(x)/V(x)$ are bounded for $x$ near $\pm\infty$.

Consider also a range of energies of the form $E_1 \le E \le E_2$. Assume
that for each $E$ in this range, there are exactly two points, $a(E)$ and $b(E)$,
with $a(E) < b(E)$, for which $V(x) = E$. Further assume that the derivative
of $V$ is nonzero at $a(E)$ and $b(E)$, for all $E \in [E_1, E_2]$.

See Fig. 15.2 for a typical example. Since $V$ is locally bounded and tends
to $+\infty$ at infinity, $H$ is essentially self-adjoint on $C_c^\infty(\mathbb{R})$ (Theorem 9.39)
and has purely discrete spectrum (Theorem XIII.16 in Volume IV of [34]).
The assumption that $V'/V$ and $V''/V$ be bounded near infinity is stronger
than necessary, but still applies to most of the interesting cases.

We refer to $a(E)$ and $b(E)$ as the turning points, since these are the
points where a classical particle with energy $E$ changes direction. When
the energy $E$ is understood as being fixed, we will write the turning points
simply as $a$ and $b$.


15.4.1 The Classically Allowed Region 


As in Sect. 15.3, we seek approximate solutions to the time-independent
Schrödinger equation having the following form in the classically allowed
region:

\[
\psi = A(x) \exp\left\{ \pm\frac{i}{\hbar} \int p(x)\, dx \right\}, \tag{15.13}
\]

where $p(x) = \sqrt{2m(E - V(x))}$ is the momentum of a classical particle with
energy $E$ and position $x$. According to (15.9), this form for $\psi$ gives

\[
H\psi - E\psi = -\frac{\hbar^2}{2m} \left( A''(x) \pm \frac{2i}{\hbar} A'(x)p(x) \pm \frac{i}{\hbar} p'(x)A(x) \right)
\times \exp\left\{ \pm\frac{i}{\hbar} \int p(x)\, dx \right\}. \tag{15.14}
\]

Since we want to obtain an approximate solution with an error of smaller
order than $\hbar$, we require that the second and third terms in parentheses in
(15.14) cancel. This cancellation will occur if $A$ satisfies

\[
2A'(x)p(x) = -p'(x)A(x)
\]

or

\[
\frac{A'(x)}{A(x)} = -\frac{1}{2} \frac{p'(x)}{p(x)}, \tag{15.15}
\]

which we can easily solve (Exercise 3) as

\[
A(x) = C(p(x))^{-1/2}. \tag{15.16}
\]

If $A$ is given by (15.16), we will have

\[
H\psi - E\psi = -\frac{\hbar^2}{2m} \frac{A''(x)}{A(x)}\, \psi(x), \tag{15.17}
\]

indicating that our error is of order $\hbar^2$. This expression, however, is only
local, in that it applies only in the classically allowed region. Furthermore,
$p(x)$ tends to zero at the turning points, which means that $A(x)$ becomes
unbounded at these points. This blow-up of the amplitude is a substantial
complicating factor in the analysis.
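
The cancellation condition can be checked numerically. The sketch below evaluates $2A'p + p'A$ by central differences for the amplitude $A = p^{-1/2}$ and, for contrast, for a constant amplitude. The sample potential $V(x) = x^2$, the values of `MASS` and `ENERGY`, and all function names are assumptions chosen for illustration; the finite-difference step limits the accuracy of the check.

```python
import math

MASS, ENERGY = 1.0, 4.0  # illustrative values (assumptions, not from the text)


def potential(x):
    return x * x  # sample potential with turning points at x = -2 and x = 2


def momentum(x):
    """p(x) = sqrt(2m(E - V(x))), valid inside the classically allowed region."""
    return math.sqrt(2.0 * MASS * (ENERGY - potential(x)))


def wkb_amplitude(x):
    """A(x) = C p(x)^(-1/2) with C = 1."""
    return momentum(x) ** -0.5


def derivative(f, x, step=1e-5):
    """Central-difference approximation to f'(x)."""
    return (f(x + step) - f(x - step)) / (2.0 * step)


def transport_residual(amplitude, x):
    """Value of 2 A'(x) p(x) + p'(x) A(x); it vanishes when the order-hbar terms cancel."""
    return (2.0 * derivative(amplitude, x) * momentum(x)
            + derivative(momentum, x) * amplitude(x))


if __name__ == "__main__":
    print(transport_residual(wkb_amplitude, 0.7))    # close to zero
    print(transport_residual(lambda x: 1.0, 0.7))    # equals p'(0.7), not zero
```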

We can get an approximate solution to the Schrödinger equation by taking
a linear combination of the function in (15.13) with two different choices
for the sign in the exponent, with constants $c_1$ and $c_2$. It is convenient to
take the basepoint of our integration to be the left-hand turning point
$a = a(E)$. Furthermore, since the Schrödinger operator $H$ commutes with
complex conjugation, the real and imaginary parts of any solution to the
time-independent Schrödinger equation are again solutions. We will therefore
consider only real-valued approximate solutions, i.e., those in which
$c_2 = \overline{c_1}$. Using Exercise 1, we can then write our approximate solution as
follows.

Summary 15.4 Suppose $\psi$ is a real-valued solution to the time-independent
Schrödinger equation. Then in the classically allowed region but away from
the turning points, we expect that $\psi$ is well approximated by an expression
of the form

\[
\frac{R}{\sqrt{p(x)}} \cos\left\{ \frac{1}{\hbar} \int_a^x p(y)\, dy - \delta \right\}, \tag{15.18}
\]

where $p(x) = \sqrt{2m(E - V(x))}$ is the momentum of a classical particle with
energy $E$ and position $x$. Here $R$ and $\delta$ are real constants, referred to as
the amplitude and the phase of the approximate solution.

We refer to the function in (15.18) as the oscillatory WKB function. In
integrating the square of the oscillatory WKB function over some interval,
we may apply the identity $\cos^2\theta = (1 + \cos(2\theta))/2$ to the cosine factor.
The rapidly oscillating $\cos(2\theta)$ term will be small for small $\hbar$ because of
cancellation between positive and negative values. Thus, the integral of
$\psi^2(x)$ over an interval will be, to leading order, just a constant times the
integral of $1/p(x)$, or, equivalently, a constant times the integral of $1/v(x)$,
where $v$ is the velocity of the classical particle. But the integral of
$1/v(x) = dt/dx$ with respect to $x$ is just the time $t$ that the classical particle
spends in the interval. We obtain, then, the following result.

Conclusion 15.5 If the amplitude $R$ in (15.18) is chosen so that $\psi$ has
$L^2$ norm 1 over $[a, b]$, then the probability of finding the quantum particle in
an interval $[c, d] \subset [a, b]$ is approximately the fraction of time the classical
particle spends in $[c, d]$ over one period of classical motion.
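
The classical side of Conclusion 15.5 is easy to compute. The sketch below integrates $dt = m\,dx/p(x)$ to obtain the fraction of a period that the classical particle spends in $[c, d]$, which is the quantity the leading-order WKB probability is expected to approximate. The function name, the substitution used to handle the $1/p$ singularity at the turning points, and the harmonic example in the usage lines are assumptions made for illustration; the turning points $a$ and $b$ must be supplied by the caller.

```python
import math


def time_fraction(potential, energy, a, b, c, d, mass=1.0, steps=4000):
    """Fraction of one classical period spent in [c, d], where a <= c < d <= b.

    Uses dt = dx / v = m dx / p(x) together with the substitution
    x = mid + half*sin(theta), which tames the 1/p singularity at a and b.
    """
    mid, half = 0.5 * (a + b), 0.5 * (b - a)

    def elapsed_time(lo, hi):
        t_lo = math.asin((lo - mid) / half)
        t_hi = math.asin((hi - mid) / half)
        dtheta = (t_hi - t_lo) / steps
        total = 0.0
        for k in range(steps):
            theta = t_lo + (k + 0.5) * dtheta  # midpoints never hit a turning point
            x = mid + half * math.sin(theta)
            p = math.sqrt(2.0 * mass * (energy - potential(x)))
            total += mass * half * math.cos(theta) / p * dtheta
        return total

    # The factor of 2 for the two directions of travel cancels in the ratio.
    return elapsed_time(c, d) / elapsed_time(a, b)


if __name__ == "__main__":
    # Harmonic example: V(x) = x^2/2 with E = 2 has turning points -2 and 2.
    print(time_fraction(lambda x: 0.5 * x * x, 2.0, -2.0, 2.0, 0.0, 1.0))
```

For the harmonic example the result can be compared with the closed form $(\arcsin(d/2) - \arcsin(c/2))/\pi$.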

15.4.2 The Classically Forbidden Region

In the classically forbidden region, let us introduce the quantity

\[
q(x) := \sqrt{2m(V(x) - E)}.
\]

We look for approximate solutions to the Schrödinger equation (15.6) of
the form

\[
\psi(x) = A(x) \exp\left\{ \pm\frac{1}{\hbar} \int q(y)\, dy \right\}.
\]

If we analyze approximate solutions of this form precisely as in the
classically allowed region, we again find that there is a unique choice for $A$ (up
to multiplication by a constant) that causes the order-$\hbar$ terms in $H\psi - E\psi$
to cancel, namely $A(x) = C(q(x))^{-1/2}$. If we are hoping to approximate a
square-integrable solution of the Schrödinger equation, we want to take a
minus sign in the exponent on the interval $(b, \infty)$, and it is convenient to
take the basepoint of our integration to be $b$. In the region $(-\infty, a)$, we want
to take a plus sign in the exponent; it is then convenient to take the basepoint
of our integration to be $a$ and to reverse the direction of integration, which
changes the sign in the exponent back to being negative.

FIGURE 15.3. The WKB functions, extended all the way to the turning points.


Summary 15.6 If $\psi_1(x)$ is a solution to the time-independent Schrödinger
equation that tends to zero as $x$ approaches $-\infty$, we expect that $\psi_1$ will be
well approximated on $(-\infty, a)$, but away from the turning point, by the
expression

\[
\frac{c_1}{\sqrt{q(x)}} \exp\left\{ -\frac{1}{\hbar} \int_x^a q(y)\, dy \right\}, \tag{15.19}
\]

where $q(x) = \sqrt{2m(V(x) - E)}$. Meanwhile, if $\psi_2(x)$ is a solution to the
time-independent Schrödinger equation that tends to zero as $x$ approaches
$+\infty$, we expect that $\psi_2$ will be well approximated on $(b, +\infty)$, but away from
the turning point, by the expression

\[
\frac{c_2}{\sqrt{q(x)}} \exp\left\{ -\frac{1}{\hbar} \int_b^x q(y)\, dy \right\}. \tag{15.20}
\]

We refer to the functions in (15.19) and (15.20) as the exponential WKB 
functions. The general theory of ordinary differential equations tells us that 
any solution to the time-independent Schrodinger equation for a smooth 
potential is smooth. Thus, the singularity at the turning points is an artifact 
of our approximation method. Nevertheless, for small values of h, the true 
solution will “track” the WKB approximation until x gets very close to 
the turning point, with the result that the true solution will be large, but 
finite, near the turning points. 

Figure 15.3 plots a potential function V(x), an energy level E, and the
WKB functions in both the classically allowed and classically forbidden
regions. In the figure, the WKB functions have been (improperly) used all
the way up to the turning points.

15.5 The Airy Function and the Connection
Formulas


For any constant $c_1$ and any energy level $E$, we expect that there is a unique
solution of the Schrödinger equation (15.6) that is well approximated
for $x$ tending to $-\infty$ by a function of the form (15.19). We expect that this
solution will be well approximated in the classically allowed region (but
not too close to the turning points) by a function of the form (15.18) for
a unique pair of constants $R$ and $\delta$. In this section, we will see that the
correct choices for $R$ and $\delta$ are

\[
R = 2c_1, \qquad \delta = \frac{\pi}{4}. \tag{15.21}
\]

The formula (15.21) for $R$ and $\delta$ is called a connection formula; there is a
similar formula connecting an approximate solution that tends to zero as $x$
tends to $+\infty$ to an approximate solution in the classically allowed region.
By comparing the two connection formulas, we will obtain conditions on
the energy $E$ under which the two approximate solutions (one that decays
near $-\infty$ and one that decays near $+\infty$) agree up to a constant in the
classically allowed region. The condition on $E$ will turn out to be precisely
Condition 15.1.

The discussion in the previous paragraph should be compared to the 
analysis in Chap. 5, where we determined the constants for the solution 
inside the well in terms of the energy level and the constant in front of 
the exponentially decaying solution outside the well. Here, of course, the 
analysis is more complicated because neither of the approximations (15.19) 
or (15.18) is valid near the turning point. The connection formula will be 
obtained, then, by using the Airy equation to approximate the Schrodinger 
equation near the turning points. 

To get a reasonable approximation of our wave function near the turning
points, we approximate $V$ locally by a linear function. (By contrast, in the
WKB functions, we are essentially thinking of $V$ as being locally constant.)
Thus, for example, near the turning point $a$, we write $V(x) - E \approx (a - x)F_0$,
where $F_0 = -V'(a)$, yielding the approximate equation

\[
-\frac{\hbar^2}{2m} \frac{d^2\psi}{dx^2} + (a - x)F_0\, \psi = 0.
\]

By making the change of variable

\[
u = \left( \frac{2mF_0}{\hbar^2} \right)^{1/3} (a - x), \tag{15.22}
\]

we can reduce the equation to

\[
\frac{d^2\psi}{du^2} - u\,\psi(u) = 0, \tag{15.23}
\]

which is the Airy equation.

Equation (15.23) has two linearly independent solutions, denoted $\mathrm{Ai}(u)$
and $\mathrm{Bi}(u)$. We are interested in the solution $\mathrm{Ai}(u)$, since this is the one
that decays for $u > 0$, that is, for $x < a$. The function $\mathrm{Ai}(u)$ is defined by
the following convergent improper integral:

\[
\mathrm{Ai}(u) = \frac{1}{\pi} \int_0^\infty \cos\left( \frac{t^3}{3} + ut \right) dt. \tag{15.24}
\]

Intuitively, convergence is due to the very rapid oscillation of the integrand
for large $t$, which produces a cancellation between the positive and negative
values of the cosine function. Rigorously, convergence can be proved
using integration by parts, as in Exercise 6. By differentiating under the
integral sign (Exercise 7), one can show that Ai indeed satisfies the Airy
equation (15.23).

As $|u|$ gets large, the integrand in (15.24) becomes more and more rapidly
oscillating, producing more cancellation. The only exception to this behavior
is when the derivative (with respect to $t$) of the function $t^3/3 + ut$ is zero.
Near such a point, the argument of the cosine function is changing slowly
and there is little oscillation. If $u$ is negative, there is a unique critical point
of $t^3/3 + ut$, at $t = \sqrt{-u}$, and we expect that the main contribution to the
integral in (15.24) will come from $t \approx \sqrt{-u}$. If $u$ is positive, $t^3/3 + ut$ has no
critical points, and we expect that the integral in (15.24) will become quite
small as $u$ tends to $+\infty$. This sort of reasoning can be used to determine
the precise asymptotics of the Airy function as $u$ tends to $+\infty$ and as $u$
tends to $-\infty$; see the discussion following (15.32) and (15.33).

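We now state our main result, which will be derived in the remainder of
this section. The result is not rigorous, because we have not estimated any
of the errors involved; such error estimates will be performed in Sect. 15.6.

Claim 15.7 If $\psi_1$ is a solution of the Schrödinger equation (15.6) that
tends to zero near $-\infty$, then $\psi_1$ can be normalized so that the following
approximations hold:

\begin{align*}
\psi_1(x) &\approx \frac{1}{2\sqrt{q(x)}} \exp\left\{ -\frac{1}{\hbar} \int_x^a q(y)\, dy \right\} && (\text{near } -\infty) \tag{15.25} \\
\psi_1(x) &\approx \frac{\sqrt{\pi}}{(2mF_0\hbar)^{1/6}}\, \mathrm{Ai}\left( \left( \frac{2mF_0}{\hbar^2} \right)^{1/3} (a - x) \right) && (\text{near } x = a) \tag{15.26} \\
\psi_1(x) &\approx \frac{1}{\sqrt{p(x)}} \cos\left\{ \frac{1}{\hbar} \int_a^x p(y)\, dy - \frac{\pi}{4} \right\} && (a < x < b). \tag{15.27}
\end{align*}

Here $F_0 = -V'(a)$ and in the case of (15.27), $x$ should not be too close to
$a$ or to $b$.

Similarly, if $\psi_2$ is a solution of the Schrödinger equation (15.6) that
tends to zero near $+\infty$, then $\psi_2$ can be normalized so that the following
approximations hold:

\begin{align*}
\psi_2(x) &\approx \frac{1}{\sqrt{p(x)}} \cos\left\{ -\frac{1}{\hbar} \int_x^b p(y)\, dy + \frac{\pi}{4} \right\} && (a < x < b) \tag{15.28} \\
\psi_2(x) &\approx \frac{\sqrt{\pi}}{(2mF_1\hbar)^{1/6}}\, \mathrm{Ai}\left( \left( \frac{2mF_1}{\hbar^2} \right)^{1/3} (x - b) \right) && (\text{near } x = b) \tag{15.29} \\
\psi_2(x) &\approx \frac{1}{2\sqrt{q(x)}} \exp\left\{ -\frac{1}{\hbar} \int_b^x q(y)\, dy \right\} && (\text{near } +\infty). \tag{15.30}
\end{align*}

Here $F_1 = V'(b)$ and in the case of (15.28), $x$ should not be too close to $a$
or to $b$.

The approximate formulas for $\psi_1$ and $\psi_2$ will agree, up to multiplication
by a constant, in the classically allowed region if and only if we have

\[
\frac{1}{\hbar} \int_a^b p(x)\, dx = \pi \left( n + \frac{1}{2} \right) \tag{15.31}
\]

for some non-negative integer $n$.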


More specifically, (15.27) and (15.28) are equal when the integer $n$ in
(15.31) is even and they are negatives of each other when $n$ is odd. Note
that there is a factor of 2 in the denominator in (15.25) but not in (15.27);
this factor accounts for the expression $R = 2c_1$ in (15.21).

Since the classical energy curve consists of two “branches,” of the form
$(x, p(x))$ and $(x, -p(x))$, the compatibility condition (15.31) is equivalent
to Condition 15.1. Since the phase of the approximate wave function in
the classically allowed region is given by $1/\hbar$ times the integral of $p\, dx$,
the condition (15.31) says that the wave function goes through a little
more than $n$ half-cycles between the two turning points, where a half-cycle
corresponds to a change in the phase in the amount of $\pi$, or the interval
between two critical points of the wave function. In particular, the wave
function has exactly $n + 1$ critical points inside the classically allowed region.
The first and last critical points occur slightly inside the turning points,
leaving a change in phase of roughly $\pi/4$ between the extreme critical point
and the turning point.

Figure 15.4 considers the same potential as in Fig. 15.3. The figure shows
the WKB functions (15.25) and (15.27), together with the scaled Airy
function (15.26), near the turning point $x = a$. Note that there is a good match
between the WKB functions and the scaled Airy function when $x$ is close
to, but not too close to, the turning point. Meanwhile, Fig. 15.5 shows
the full approximate wave function with $\hbar$ chosen so that (15.31) holds
with $n = 39$, obtained by using the WKB functions away from the turning
points and the scaled Airy functions near the turning points. Finally,
Fig. 15.6 shows the probability distribution associated to the approximate
wave function, plotted together with the function $1/p(x)$. (Compare the
discussion preceding Conclusion 15.5.)

FIGURE 15.4. Plots of the scaled Airy function (thick curve) and the WKB
functions, near the turning point x = a.

We now derive the results in Claim 15.7. The Airy function Ai(u) is 
known to have the following asymptotic behavior: 

Ai(u) ~ 2 VtFm1/ 4 6XP {~^ 3/2 } ’ u ^ + °°’ (15.32) 


and 


Ai ( M ) ~ ^(-^1/4 C ° S Q ( ~ M)3/2 ~ \ ) ’ ( 15 - 33 ) 

FIGURE 15.6. The probability distribution of the approximate wave function,
plotted against the function $1/p(x)$.

For $u$ tending to $-\infty$, the asymptotics in (15.33) can be obtained by a
straightforward application of the “method of stationary phase,” as
explained in Exercise 9. For $u$ tending to $+\infty$, repeated integrations by
parts (Exercise 8) show that $\mathrm{Ai}(u)$ decays faster than any power of $u$,
which is all that is strictly required for the main theorem of Sect. 15.6.
To obtain the precise asymptotics in (15.32), one should deform the contour
of integration to obtain a different integral representation of
$\mathrm{Ai}(u)$, and then apply some variant of the method of stationary phase,
such as Laplace’s method or the method of steepest descent. See Sect. 4.7 of
[30] for one approach to this analysis.
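
The asymptotic formulas (15.32) and (15.33) can be compared numerically with
values of the Airy function computed independently. The following Python
sketch is only an illustration under stated assumptions: it evaluates
$\mathrm{Ai}(u)$ from its Maclaurin series (two power-series solutions of
$y'' = uy$, combined using the standard values of $\mathrm{Ai}(0)$ and
$\mathrm{Ai}'(0)$ in terms of the gamma function), which is reliable only for
moderate $|u|$. The function names are hypothetical.

```python
import math

# Standard values Ai(0) = 1/(3^(2/3) Gamma(2/3)), Ai'(0) = -1/(3^(1/3) Gamma(1/3)).
AI_0 = 1.0 / (3.0 ** (2.0 / 3.0) * math.gamma(2.0 / 3.0))
AI_PRIME_0 = -1.0 / (3.0 ** (1.0 / 3.0) * math.gamma(1.0 / 3.0))


def airy_ai(u, terms=200):
    """Ai(u) from its Maclaurin series; intended for moderate |u| (roughly |u| <= 10)."""
    f_term, g_term = 1.0, u          # first terms of the two series solutions of y'' = u*y
    f_sum, g_sum = f_term, g_term
    u3 = u ** 3
    for k in range(terms):
        f_term *= u3 / ((3 * k + 2) * (3 * k + 3))
        g_term *= u3 / ((3 * k + 3) * (3 * k + 4))
        f_sum += f_term
        g_sum += g_term
    return AI_0 * f_sum + AI_PRIME_0 * g_sum


def ai_decaying(u):
    """Right-hand side of (15.32), for u > 0."""
    return math.exp(-2.0 / 3.0 * u ** 1.5) / (2.0 * math.sqrt(math.pi) * u ** 0.25)


def ai_oscillating(u):
    """Right-hand side of (15.33), for u < 0."""
    w = -u
    return math.cos(2.0 / 3.0 * w ** 1.5 - math.pi / 4.0) / (math.sqrt(math.pi) * w ** 0.25)


for u in (2.0, 5.0):
    print(u, airy_ai(u), ai_decaying(u))
for u in (-4.0, -8.0):
    print(u, airy_ai(u), ai_oscillating(u))
```

The agreement improves as $|u|$ grows, which is the sense in which (15.32) and
(15.33) are asymptotic statements.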

We will use the Airy function on an interval around the turning points
with a length that goes to zero as $\hbar$ tends to zero (so that the linear
approximation to the potential gets better and better) but with a length
that is large compared to $\hbar^{2/3}$ (so that the value of $u$ at the ends of
the interval will be large, putting us into the asymptotic region of the Airy
function). See Sect. 15.6 for more information.
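
To make the two competing requirements concrete, the following Python sketch
tabulates an interval half-length of order $\hbar^{1/2}$ (the choice made in
Sect. 15.6) together with the corresponding size of $|u|$ at the ends of the
interval. It assumes only that $u$ is of order $\hbar^{-2/3}$ times the distance
to the turning point; the constant $(2mF_0)^{1/3}$ is dropped, and the function
name is hypothetical.

```python
def endpoint_scales(hbar):
    """Return (eps, u_scale) for the illustrative choice eps = hbar**(1/2).

    u_scale = eps / hbar**(2/3) is the order of |u| at the ends of the interval,
    with all hbar-independent constants set to 1 (an assumption for brevity).
    """
    eps = hbar ** 0.5
    u_scale = eps / hbar ** (2.0 / 3.0)
    return eps, u_scale


for hbar in (1e-2, 1e-4, 1e-6, 1e-8):
    eps, u_scale = endpoint_scales(hbar)
    print(f"hbar={hbar:.0e}  eps={eps:.1e}  |u| at the ends ~ {u_scale:.2f}")
```

The half-length shrinks while $|u|$ at the ends grows like $\hbar^{-1/6}$, although
only slowly.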

We use the linear approximation $V(x) - E \approx (a - x)F_0$ to the potential
near $x = a$, where $F_0 = -V'(a)$, which turns the Schrödinger equation (15.6)
into the Airy equation, as previously noted. Now, the linear approximation
to $V$ yields

$$p \approx \sqrt{2mF_0}\,\sqrt{x - a} \tag{15.34}$$

and

$$\frac{1}{\hbar}\int_a^x p(y)\,dy \approx \frac{2}{3}\,\frac{\sqrt{2mF_0}\,(x - a)^{3/2}}{\hbar} = \frac{2}{3}(-u)^{3/2}. \tag{15.35}$$

From here it is a simple matter to check, using (15.33), that

$$\mathrm{Ai}(u) \approx \frac{(2mF_0\hbar)^{1/6}}{\sqrt{\pi}\,\sqrt{p(x)}}\cos\left(\frac{1}{\hbar}\int_a^x p(y)\,dy - \frac{\pi}{4}\right)$$

for $x > a$, where the approximation holds in an intermediate region where
$x$ is close to $a$ but not too close to $a$. Thus, if we scale our solution
$\psi_1$ to the Schrödinger equation so that it is approximated by
$\pi^{1/2}(2mF_0\hbar)^{-1/6}$ times $\mathrm{Ai}(u)$ near $x = a$, it should
satisfy (15.27) in the classically allowed region (but away from the turning
points). It is then straightforward to verify, using (15.32), that this
multiple of $\mathrm{Ai}(u)$ satisfies (15.25) for $x$ near $-\infty$. The
analysis of $\psi_2$ is entirely similar.

Finally, to compare the approximations (15.27) and (15.28), we note that

$$\frac{1}{\hbar}\int_a^x p(y)\,dy - \frac{\pi}{4} = -\left(\frac{1}{\hbar}\int_x^b p(y)\,dy - \frac{\pi}{4} - \phi\right),$$

where

$$\phi = \frac{1}{\hbar}\int_a^b p(y)\,dy - \frac{\pi}{2}.$$

Now, if $\phi$ is an odd multiple of $\pi$, then $\cos(\theta - \phi) = -\cos\theta$,
and if $\phi$ is an even multiple of $\pi$, then $\cos(\theta - \phi) = \cos\theta$.
For all other values of $\phi$ (Exercise 4), $\cos(\theta - \phi)$ is not a
constant multiple of $\cos\theta$. Thus, (15.31) is a necessary and sufficient
condition for the two approximate solutions to agree up to a constant in the
classically allowed region.
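
The requirement that $\phi$ be an integer multiple of $\pi$ says that
$\frac{1}{\hbar}\int_a^b p(y)\,dy = (n + 1/2)\pi$, which can be solved numerically
for the energy. The Python sketch below does this for the hypothetical example
$V(x) = x^4$ with $m = \hbar = 1$ as default values; the potential, the
substitution $x = b\sin t$ used to tame the square-root behavior at the turning
points, and all function names are assumptions made for illustration.

```python
import math


def action_integral(E, m=1.0, n_steps=400):
    """Integral of p(x) = sqrt(2m(E - x**4)) between the turning points -b and b.

    With b = E**(1/4) and x = b*sin(t), the integrand becomes smooth on
    [-pi/2, pi/2]; the integral is then approximated by composite Simpson's rule.
    """
    b = E ** 0.25

    def integrand(t):
        s, c = math.sin(t), math.cos(t)
        return math.sqrt(2.0 * m * E * (1.0 + s * s)) * b * c * c

    h = math.pi / n_steps                      # n_steps must be even
    total = integrand(-math.pi / 2) + integrand(math.pi / 2)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) * integrand(-math.pi / 2 + i * h)
    return total * h / 3.0


def wkb_energy(n, hbar=1.0, m=1.0):
    """Energy E with (1/hbar) * action_integral(E) = (n + 1/2) * pi, by bisection."""
    target = (n + 0.5) * math.pi * hbar
    lo, hi = 0.0, 1.0
    while action_integral(hi, m) < target:     # the action is increasing in E
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if action_integral(mid, m) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for n in range(4):
    print(n, wkb_energy(n))
```

For this potential the action scales like $E^{3/4}$, so the energies produced
by the sketch grow like $(n + 1/2)^{4/3}$.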


15.6 A Rigorous Error Estimate 

The preceding sections give a treatment of the WKB approximation that is
typical of many books in the literature. This treatment gives the idea that
energies $E$ satisfying the corrected Bohr-Sommerfeld Condition
(Condition 15.1) should be approximate eigenvalues for the Hamiltonian
operator $\hat{H}$, without specifying the sense in which this approximation
holds. In this section, we prove a rigorous estimate, as follows.

Theorem 15.8 For any potential $V$ and range $[E_1, E_2]$ of energies
satisfying Assumption 15.3, there is a constant $C$ such that the following
holds. For any energy $E \in [E_1, E_2]$ satisfying Condition 15.1, there exists
a nonzero function $\psi$ belonging to $\mathrm{Dom}(\hat{H})$ such that

$$\|\hat{H}\psi - E\psi\| < C\hbar^{9/8}\,\|\psi\|. \tag{15.36}$$

As noted already in Sect. 15.3, an estimate of the form
$\|\hat{H}\psi - E\psi\| < \varepsilon\|\psi\|$ implies that there is a point
$\tilde{E}$ in the spectrum of $\hat{H}$ with $|\tilde{E} - E| < \varepsilon$.
(See Exercise 4 in Chap. 10.) Since, under our assumptions on $V$, the spectrum
of $\hat{H}$ is purely discrete, we conclude that for each number
$E \in [E_1, E_2]$ satisfying Condition 15.1, there is an actual eigenvalue
$\tilde{E}$ for $\hat{H}$ with

$$|\tilde{E} - E| < C\hbar^{9/8}. \tag{15.37}$$

If $E$ satisfies Condition 15.1, then the estimate (15.37) actually holds
with $\hbar^{9/8}$ replaced by $\hbar^2$ on the right-hand side. It is not,
however, possible to obtain such an optimal estimate by the methods we are
using in this chapter. Specifically, the approximate eigenvector $\psi$
constructed in the proof of Theorem 15.8 does not satisfy an estimate of the
form $\|\hat{H}\psi - E\psi\| < C\hbar^2$. One can, however, construct an
approximate eigenvector by different methods (for example, the method in
[31]) that satisfies an order-$\hbar^2$ error estimate, for any $E$ satisfying
the corrected Condition 15.1. Nevertheless, the error bound in (15.37) is small
compared to the typical spacing between the energy levels, which is of order
$\hbar$.

Recall, as we noted at the beginning of Sect. 15.4, that a Schrödinger
operator with potential $V$ that is smooth and tends to $+\infty$ at $\pm\infty$
is essentially self-adjoint on $C_c^\infty(\mathbb{R})$. The operator $\hat{H}$
in Theorem 15.8 is, more precisely, the unique self-adjoint extension of the
Schrödinger operator defined on $C_c^\infty(\mathbb{R})$.

15.6.1 Preliminaries 

Our construction of the approximate eigenfunction $\psi$ will be essentially
by the WKB approximation as outlined in Claim 15.7. That is to say,
we will define $\psi$ using scaled Airy functions near the turning points and
by the standard WKB functions in the classically allowed and classically
forbidden regions. There is, however, a difficulty with this approach, which
is that at the boundary between different regions, the scaled Airy function
does not exactly match the WKB functions, but only approximately. What
this means is that if we define $\psi$ by the WKB formula in, say, an interval
of the form $(-\infty, a - \varepsilon)$ and we define $\psi$ by a scaled Airy
function on $(a - \varepsilon, a + \varepsilon)$, then $\psi$ may be
discontinuous at $a - \varepsilon$. Even if we scale $\psi$ by a constant on one
of these intervals to eliminate the discontinuity in $\psi$ itself, the
derivative of $\psi$ will still probably be discontinuous. But if the
derivative of $\psi$ is discontinuous, $\psi$ is not actually in the domain of
$\hat{H}$, and the left-hand side of (15.36) does not make sense. (Compare
Sect. 5.2.)

The condition that $\psi'$ be continuous is not just a technicality: If we
did not worry about continuity of $\psi'$, then we could always match the
scaled Airy function to the WKB functions, just by multiplying the various
functions by constants, regardless of whether or not the energy satisfies the
corrected Bohr-Sommerfeld Condition. In that case, we would be claiming
that any number $E \in [E_1, E_2]$ is within $C\hbar^{9/8}$ of an eigenvalue of
$\hat{H}$, which is false already for the harmonic oscillator.

To work around the difficulty described in the previous paragraphs, we
must put in a transition region over which we smoothly pass from one
function to the other, using the “join” construction described in
Sect. 15.6.4. Thus, we define the function $\psi$ in Theorem 15.8 as follows.
We use the formulas in Claim 15.7 in the indicated intervals, except that we
multiply the functions (15.28), (15.29), and (15.30) by $-1$ when $n$ is odd.
We use the scaled Airy functions (15.26) and (15.29) on intervals of the form
$(a - \varepsilon, a + \varepsilon)$ and $(b - \varepsilon, b + \varepsilon)$,
respectively, for some $\varepsilon$ depending on $\hbar$ in a manner to be
determined later. We then put in four transition regions, each having length
$\delta$, where $\delta$ also depends on $\hbar$ in a manner to be determined
later. The first transition region, for example, is the interval
$(a - \varepsilon - \delta, a - \varepsilon)$ between the first classically
forbidden region and the first turning point. In each transition region, we
change over smoothly from one function to another. See Fig. 15.7 for an
illustration of the transition regions around the turning point $x = a$.

FIGURE 15.7. The approximate eigenfunction $\psi$, with the transition regions
shaded.
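
The bookkeeping of the construction (five formulas and four transition
regions) can be summarized in a few lines of code. The following Python
sketch simply reports which piece of the definition applies at a given point;
the function name `region` and the label strings are hypothetical, and the
sketch assumes $a + \varepsilon + \delta < b - \varepsilon - \delta$.

```python
import bisect

LABELS = [
    "forbidden (left): exponential WKB",
    "transition",
    "turning point a: scaled Airy",
    "transition",
    "allowed: oscillatory WKB",
    "transition",
    "turning point b: scaled Airy",
    "transition",
    "forbidden (right): exponential WKB",
]


def region(x, a, b, eps, delta):
    """Return the label of the region containing x.

    The eight break points are a -/+ eps, b -/+ eps and the outer ends of the
    four transition regions of length delta.
    """
    edges = [a - eps - delta, a - eps, a + eps, a + eps + delta,
             b - eps - delta, b - eps, b + eps, b + eps + delta]
    return LABELS[bisect.bisect_right(edges, x)]


for x in (-2.0, -1.12, -1.0, -0.87, 0.0, 0.87, 1.0, 1.12, 2.0):
    print(x, region(x, a=-1.0, b=1.0, eps=0.1, delta=0.05))
```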

Suppose $\hat{H}_0$ denotes the Schrödinger operator with potential $V$, with
domain equal to $C_c^\infty(\mathbb{R})$. Then, as we have noted, $\hat{H}_0$ is
essentially self-adjoint, and we are letting $\hat{H}$, which coincides with
the adjoint operator $\hat{H}_0^*$, denote the unique self-adjoint extension of
$\hat{H}_0$. Now, the domain of $\hat{H}_0^*$ consists of all functions
$\psi \in L^2(\mathbb{R})$ such that the Schrödinger operator applied to $\psi$,
computed in the distributional sense, again belongs to $L^2(\mathbb{R})$. In
particular, if $\psi$ is smooth, then $\psi$ belongs to the domain of
$\hat{H} = \hat{H}_0^*$ if and only if $\psi$ is in $L^2(\mathbb{R})$ and
$-(\hbar^2/2m)\psi'' + V\psi$ is also in $L^2(\mathbb{R})$.

Because of the joins, our approximate eigenfunction $\psi$ is actually
infinitely differentiable on all of $\mathbb{R}$. And since $V(x)$ tends to
$+\infty$ at $\pm\infty$, the exponential WKB functions (15.25) and (15.30) have
rapid decay at infinity, which shows that $\psi$ is in $L^2(\mathbb{R})$.
Furthermore, for $x$ near $\pm\infty$, the calculation (15.17) applies, with
$A(x) = Cq(x)^{-1/2}$. We obtain, after a short calculation,



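$$\hat{H}\psi - E\psi = -\frac{\hbar^2}{2m}\left[\frac{5}{16}\,\frac{V'(x)^2}{(V(x) - E)^2} - \frac{1}{4}\,\frac{V''(x)}{V(x) - E}\right]\psi(x). \tag{15.38}$$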


Since $V'/V$ and $V''/V$ are assumed to be bounded near infinity and $V(x)$
tends to $+\infty$ at $\pm\infty$, we see that the Schrödinger operator applied
to $\psi$ is bounded by a constant times $\psi$ near infinity and is thus
square integrable. This shows that $\psi$ is in the domain of $\hat{H}$.

In Sect. 15.6.2, we will take the width $2\varepsilon$ of the region around the
turning points to be of order $\hbar^{1/2}$. In that case, the $L^2$ norm of our
approximate wave function is of order 1 (bounded and bounded away from zero)
as $\hbar$ tends to zero, despite the blow-up of order $\hbar^{-1/6}$ very near
the turning points. Although this result is not hard to verify (Exercise 10),
it is not essential: if anything, the norm would be blowing up as $\hbar$ tends
to zero, which would only help us in showing that $\|\hat{H}\psi - E\psi\|$ is
small compared to $\|\psi\|$.

To prove Theorem 15.8, we must estimate the contributions to the quantity
$\|\hat{H}\psi - E\psi\|$ from four different types of regions: the classically
allowed region, the classically forbidden regions, the regions near the
turning points, and the transition regions. These estimates will occupy the
remainder of this section, with the analysis in the transition regions being
the most involved. In particular, it is essential that the derivative of the
scaled Airy function almost match the derivative of the WKB function in the
transition region, as in the second part of Lemma 15.9.

15.6.2 The Regions Near the Turning Points 

We use a scaled Airy function in an interval around each turning point.
[We use (15.26) near $x = a$ and either (15.29) or the negative thereof near
$x = b$, depending on whether $n$ is even or odd.] We now verify that taking
these intervals to have length of order $\hbar^{1/2}$ will give satisfactory
estimates. If $\psi$ denotes one of the scaled Airy functions, then $\psi$
satisfies a Schrödinger equation in which the potential $V$ is replaced by a
linear approximation $\tilde{V}$ near one of the turning points, which means
that

$$\hat{H}\psi - E\psi = (V(x) - \tilde{V}(x))\psi. \tag{15.39}$$

The difference between $V(x)$ and its linear approximation $\tilde{V}(x)$ grows
at most quadratically with the distance from the turning point. Meanwhile,
the asymptotics of the Airy function tell us that it can be bounded as
$|\mathrm{Ai}(u)| < C|u|^{-1/4}$. (This is a terrible estimate for small $u$,
but still true.) Now $u$, as defined in (15.22), is of order $\hbar^{-2/3}$ times
the distance to the turning point. Since, also, there is a factor of
$\hbar^{-1/6}$ in (15.26) and the distance from the turning point is at most of
order $\hbar^{1/2}$, we find that

$$|\hat{H}\psi - E\psi| < C(\hbar^{1/2})^2\,\hbar^{-1/6}\,(\hbar^{-2/3}\hbar^{1/2})^{-1/4} = C\hbar^{7/8}$$

over the interval around each turning point. Finally, if a function $f$
satisfies $|f| < D$ on an interval of length $L$, then the $L^2$ norm of $f$
over that interval will be at most $D\sqrt{L}$. Thus, over the interval around
the turning points,

$$\|\hat{H}\psi - E\psi\| = O(\hbar^{7/8}\hbar^{1/4}) = O(\hbar^{9/8}).$$
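
The powers of $\hbar$ in this estimate are easy to mis-add, so it can help to
track them with exact rational arithmetic. The following Python sketch is
only a bookkeeping aid: each variable holds the exponent of $\hbar$ carried by
one factor in the argument above, and the variable names are hypothetical.

```python
from fractions import Fraction as F

distance = F(1, 2)                     # distance to the turning point ~ hbar**(1/2)
potential_error = 2 * distance         # |V - V_linear| grows like distance**2
prefactor = F(-1, 6)                   # factor hbar**(-1/6) in the scaled Airy function
u_size = F(-2, 3) + distance           # u ~ hbar**(-2/3) * distance
airy_bound = F(-1, 4) * u_size         # |Ai(u)| bounded by C*|u|**(-1/4)

pointwise = potential_error + prefactor + airy_bound
l2_norm = pointwise + distance / 2     # multiply by sqrt(interval length)

print("pointwise bound: hbar **", pointwise)
print("L2 bound:        hbar **", l2_norm)
```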

15.6.3 The Classically Allowed and Classically Forbidden 
Regions 

The expression (15.38) for $\hat{H}\psi - E\psi$, derived from (15.17), applies
both in the classically allowed region and in the classically forbidden
regions. Let us consider first the classically allowed region. Although
(15.38) is nominally of order $\hbar^2$, we use this expression on an interval
whose ends get closer and closer to the turning point as $\hbar$ tends to zero.
Since, also, the expression in (15.38) is blowing up at the turning points,
the contribution to $\|\hat{H}\psi - E\psi\|$ from this interval is of order
larger than $\hbar^2$.

We have taken the interval around the turning point to have length
$2\varepsilon$ that is of order $\hbar^{1/2}$, and we will also take
(Sect. 15.6.4) the transition regions to have length $\delta$ that is of order
$\hbar^{1/2}$. Thus, we use the oscillatory WKB function on an interval of the
form $(a + \gamma, b - \gamma)$, where $\gamma = \varepsilon + \delta$ is of
order $\hbar^{1/2}$. Now, the formula for $\psi$ in the classically allowed
region has a factor of $1/\sqrt{p(x)}$ times a bounded quantity (the cosine
factor). Since $V'(a)$ is assumed to be nonzero, $V(x) - E$ behaves like a
constant times $(x - a)$ and so $1/\sqrt{p(x)}$ behaves like a constant times
$(x - a)^{-1/4}$ for $x$ approaching $a$, with similar behavior near the other
turning point.

Meanwhile, the more problematic term in (15.38) is the term having
$(V(x) - E)^2$ in the denominator. Keeping in mind the $(x - a)^{-1/4}$ blow-up
of $\psi$ itself, this term behaves like $(x - a)^{-9/4}$ as $x$ approaches $a$.
Thus, we may estimate the norm of $\hat{H}\psi - E\psi$ over the left half of
the classically allowed region as

$$\|\hat{H}\psi - E\psi\| < C\hbar^2\left(\int_{a+\gamma}^{(a+b)/2}(x - a)^{-9/2}\,dx\right)^{1/2} = C'\hbar^2\left(\gamma^{-7/2} - ((b - a)/2)^{-7/2}\right)^{1/2}.$$

Since $\gamma$ is of order $\hbar^{1/2}$, the contribution to
$\|\hat{H}\psi - E\psi\|$ from the interval $(a + \gamma, (a + b)/2)$ will
consist of a term of order $\hbar^2\hbar^{-7/8} = \hbar^{9/8}$, plus lower-order
terms. The estimate over the other half of the classically allowed region is
similar.
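
As a rough numerical illustration of this scaling, the following Python sketch
evaluates the integral of $(x - a)^{-9/2}$ in closed form, cross-checks it by
quadrature, and confirms that $\hbar^2$ times its square root behaves like
$\hbar^{9/8}$ when $\gamma = \hbar^{1/2}$. The half-width $(b - a)/2 = 1$, the
unit constants, and the function names are assumptions made for illustration.

```python
import math


def tail_norm_closed(gamma, half_width):
    """( integral from gamma to half_width of s**(-9/2) ds )**(1/2), with s = x - a."""
    return math.sqrt((2.0 / 7.0) * (gamma ** -3.5 - half_width ** -3.5))


def tail_norm_numeric(gamma, half_width, n=20000):
    """Same quantity by the midpoint rule after the substitution s = exp(t)."""
    t0, t1 = math.log(gamma), math.log(half_width)
    dt = (t1 - t0) / n
    # s**(-9/2) ds = exp(-3.5 * t) dt under s = exp(t)
    total = sum(math.exp(-3.5 * (t0 + (i + 0.5) * dt)) for i in range(n)) * dt
    return math.sqrt(total)


for hbar in (1e-2, 1e-4, 1e-6, 1e-8):
    gamma = hbar ** 0.5
    estimate = hbar ** 2 * tail_norm_closed(gamma, 1.0)
    print(f"hbar={hbar:.0e}  estimate / hbar**(9/8) = {estimate / hbar ** 1.125:.6f}")
```

The printed ratio settles to a constant as $\hbar$ decreases, in line with the
order-$\hbar^{9/8}$ claim.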

Meanwhile, in the first classically forbidden region, we also apply (15.38).
By Assumption 15.3, $V'/V$ and $V''/V$ are bounded near infinity. Thus,
$V'/(V - E)$ and $V''/(V - E)$ will also be bounded near infinity, and thus
also bounded on $(-\infty, a - 1)$, since $V - E$ is strictly positive on this
interval and tends to $+\infty$ as $x$ tends to $-\infty$. We see, then, that
the norm of $\hat{H}\psi - E\psi$ over $(-\infty, a - 1)$ is bounded by a
constant times $\hbar^2\|\psi\|$.

The norm of $\hat{H}\psi - E\psi$ over an interval of the form
$(a - 1, a - \gamma)$ can be analyzed similarly to the classically allowed
region. The estimates from this region are better, however, because of the
exponentially decaying factor in the definition of the WKB function. Thus,
the contribution to $\|\hat{H}\psi - E\psi\|$ from the classically forbidden
region $(-\infty, a - \gamma)$ is certainly no larger than order
$\hbar^{9/8}$, and similarly for the other classically forbidden region.

FIGURE 15.8. The join of two functions over the interval $[a, a + \delta]$
(thick curve).

15.6.4 The Transition Regions 

Given two smooth functions $\psi_1$ and $\psi_2$ and some interval of the form
$[a, a + \delta]$, we now define a “join” $\psi_1 \sqcup \psi_2$ of $\psi_1$
and $\psi_2$, where $(\psi_1 \sqcup \psi_2)(x)$ is equal to $\psi_1(x)$ for
$x < a$ and equal to $\psi_2(x)$ for $x > a + \delta$, and where
$\psi_1 \sqcup \psi_2$ is smooth everywhere. Let $\chi$ be a smooth function on
$[0, 1]$ that is identically equal to 0 in a neighborhood of 0 and identically
equal to 1 in a neighborhood of 1. Then define $\psi_1 \sqcup \psi_2$ by

$$(\psi_1 \sqcup \psi_2)(x) = \psi_1(x) + (\psi_2(x) - \psi_1(x))\,\chi((x - a)/\delta).$$

(See Fig. 15.8.) By direct calculation, we have

$$\begin{aligned}
(\hat{H} - EI)(\psi_1 \sqcup \psi_2) ={}& (\hat{H}\psi_1 - E\psi_1) \sqcup (\hat{H}\psi_2 - E\psi_2) \\
&- \frac{\hbar^2}{\delta m}\,(\psi_2'(x) - \psi_1'(x))\,\chi'((x - a)/\delta) \\
&- \frac{\hbar^2}{2\delta^2 m}\,(\psi_2(x) - \psi_1(x))\,\chi''((x - a)/\delta).
\end{aligned} \tag{15.40}$$
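
The join construction is straightforward to realize concretely. The Python
sketch below uses one possible choice of the cutoff $\chi$, built from the
standard smooth function $e^{-1/t}$ and arranged to vanish for arguments below
$1/4$ and to equal 1 for arguments above $3/4$; this particular $\chi$ and the
function names are assumptions, since any cutoff with the stated properties
would do.

```python
import math


def _bump(t):
    """exp(-1/t) for t > 0 and 0 otherwise; smooth on the whole real line."""
    return math.exp(-1.0 / t) if t > 0 else 0.0


def chi(s):
    """Smooth cutoff: identically 0 for s <= 1/4 and identically 1 for s >= 3/4."""
    t = 2.0 * s - 0.5
    return _bump(t) / (_bump(t) + _bump(1.0 - t))


def join(psi1, psi2, a, delta):
    """Return the join of psi1 and psi2 over the interval [a, a + delta]."""
    def joined(x):
        return psi1(x) + (psi2(x) - psi1(x)) * chi((x - a) / delta)
    return joined


f = join(math.sin, math.cos, a=0.0, delta=0.5)
for x in (-1.0, 0.1, 0.25, 0.4, 1.0):
    print(x, f(x), math.sin(x), math.cos(x))
```

To the left of the interval the join reproduces the first function exactly, and
to the right it reproduces the second, with a smooth interpolation in between.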

In constructing our approximate eigenfunction, we use five different
formulas in five different regions: the two classically forbidden regions, the
classically allowed region, and the regions near the two turning points. Since
none of these functions exactly matches the function in the next interval,
we put in a total of four joins in order to produce a function that is in the
domain of $\hat{H}$. We choose the width $\delta$ of the interval on which the
join takes place to be of the same size as the intervals around the turning
points, namely, order $\hbar^{1/2}$.

The most critical case is the transition from the region near the turning
points to the classically allowed region. Consider, for example, the scaled
Airy function $\psi_1$ in (15.26) and the oscillatory WKB function $\psi_2$ in
(15.27). There are two contributions to the mismatch between these two
functions. First, there is a discrepancy between the Airy function and its
leading-order asymptotics. Second, there is an error in the approximations
(15.34) and (15.35), which come from the discrepancy between the potential
$V(x)$ and its linear approximation $\tilde{V}(x)$ near $x = a$. We need to
consider both contributions to the mismatch in our estimation of
$\psi_1 - \psi_2$ and of $\psi_1' - \psi_2'$.

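Lemma 15.9 Let $\psi_1$ denote the scaled Airy function in (15.26), let
$\tilde{\psi}_1$ denote the same function with the Airy function replaced by the
right-hand side of (15.33), and let $\psi_2$ denote the oscillatory WKB
function in (15.27). If $x - a$ is positive and of order $\hbar^{1/2}$, we have

$$\begin{aligned}
|\psi_1(x) - \tilde{\psi}_1(x)| &= O(\hbar^{1/8}) \\
|\tilde{\psi}_1(x) - \psi_2(x)| &= O(\hbar^{1/8})
\end{aligned}$$

and

$$\begin{aligned}
|\psi_1'(x) - \tilde{\psi}_1'(x)| &= O(\hbar^{-5/8}) \\
|\tilde{\psi}_1'(x) - \psi_2'(x)| &= O(\hbar^{-5/8}).
\end{aligned}$$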


Before giving the proof of this lemma, let us verify that these estimates
are sufficient to control the contribution to $\|\hat{H}\psi - E\psi\|$ from
the transition region $(a + \varepsilon, a + \varepsilon + \delta)$ between the
first turning point and the classically allowed region, where both
$\varepsilon$ and $\delta$ are taken to be of order $\hbar^{1/2}$. We must
consider each of the three lines in (15.40). The $L^2$ norm of the first line
is of order at most $\hbar^{9/8}$, by precisely the same argument as in
Sect. 15.6.3.

For the second and third lines, we recall that if a function $f$ is bounded
by $C$, then the $L^2$ norm of $f$ over an interval of length $L$ is at most
$C\sqrt{L}$. Since we are taking the length $\delta$ of our transition interval
to be of order $\hbar^{1/2}$, the $L^2$ norm of the second line of (15.40) is of
order

$$\frac{1}{\hbar^{1/2}}\,\hbar^2\,\hbar^{-5/8}\,\hbar^{1/4} = \hbar^{9/8}.$$

Meanwhile, the contribution from the third line of (15.40) is of order

$$\frac{1}{\hbar}\,\hbar^2\,\hbar^{1/8}\,\hbar^{1/4} = \hbar^{11/8}.$$

Thus, the contribution to $\|\hat{H}\psi - E\psi\|$ from the transition region
$(a + \varepsilon, a + \varepsilon + \delta)$ is of order at most $\hbar^{9/8}$.

The analysis of the transition between the classically allowed region and
the region around $x = b$ is entirely similar. The analysis of the transitions
between the regions near the turning points and the classically forbidden
regions is also similar, but much less delicate, because all of the functions
involved are very small in the transition region. When $(a - x)$ is positive
and of order $\hbar^{1/2}$, for example, $u$, as defined in (15.22), will be of
order $\hbar^{-1/6}$ and so $u^{3/2}$ is of order $\hbar^{-1/4}$. Thus, the
exponential factor in the leading-order asymptotics of the Airy function for
$u > 0$ will behave like $\exp(-C\hbar^{-1/4})$, which is very small for small
$\hbar$, certainly smaller than any power of $\hbar$. Since all the factors in
front of the exponential will behave like $\hbar$ to a power, the overall
contribution to $\|\hat{H}\psi - E\psi\|$ from the transition between the
region near the turning points and the classically forbidden region is smaller
than any power of $\hbar$. Thus, none of the transition regions contributes an
error worse than $O(\hbar^{9/8})$.

Proof of Lemma 15.9. We consider only the estimates for the derivatives 
of the functions involved. The analysis of the functions themselves is similar 
(but easier) and is left as an exercise to the reader (Exercise 11). 

We begin by considering $\psi_1' - \tilde{\psi}_1'$. With a little algebra, we
compute that

$$\frac{d\psi_1}{dx} - \frac{d\tilde{\psi}_1}{dx} = -\sqrt{\pi}\,(2mF_0)^{1/6}\,\hbar^{-5/6}\left(\mathrm{Ai}'(u) - \widetilde{\mathrm{Ai}}'(u)\right) \tag{15.41}$$

where $u$ is as in (15.22) and where $\widetilde{\mathrm{Ai}}$ is the function on
the right-hand side of (15.33).

Now, $\mathrm{Ai}(u)$ has an asymptotic expansion for $u \to -\infty$ given by

$$\mathrm{Ai}(u) = \widetilde{\mathrm{Ai}}(u)\left(1 + C(-u)^{-3/2} + \cdots\right),$$

and $\mathrm{Ai}'(u)$ has the asymptotic expansion obtained by formally
differentiating this with respect to $u$. [See Eq. (7.64) in [30].] From this,
we obtain

$$\mathrm{Ai}'(u) - \widetilde{\mathrm{Ai}}'(u) = \widetilde{\mathrm{Ai}}'(u)\,O((-u)^{-3/2}) + \widetilde{\mathrm{Ai}}(u)\,O((-u)^{-5/2}). \tag{15.42}$$

From the explicit formula for $\widetilde{\mathrm{Ai}}$, we see that
$\widetilde{\mathrm{Ai}}(u)$ is of order $(-u)^{-1/4}$.

Meanwhile, the formula for $\widetilde{\mathrm{Ai}}'(u)$ will contain two terms,
the larger of which will be of order $(-u)^{1/4}$. Thus, the slower-decaying
term on the right-hand side of (15.42) is the first one, which is of order
$(-u)^{-5/4}$. Now, in the transition regions, $|u|$ behaves like
$\hbar^{-2/3}\hbar^{1/2} = \hbar^{-1/6}$. Thus, (15.42) goes like $\hbar^{5/24}$
and so (15.41) goes like $\hbar^{-5/6+5/24} = \hbar^{-5/8}$, as claimed.

We now consider $\tilde{\psi}_1' - \psi_2'$. By direct calculation, the
derivatives of $\tilde{\psi}_1$ and $\psi_2$ each consist of two terms, a
“dominant” term obtained by differentiating the cosine factor and a
“subdominant” term obtained by differentiating the coefficient of the cosine
factor. In the case of $\tilde{\psi}_1$, the dominant term in the derivative may
be simplified to

$$-\frac{1}{\hbar}\left((2mF_0)(x - a)\right)^{1/4}\sin\left(\frac{2}{3}(-u)^{3/2} - \frac{\pi}{4}\right). \tag{15.43}$$

According to Exercise 12, we have, when $x - a$ is of order $\hbar^{1/2}$, the
estimates

$$\left((2mF_0)(x - a)\right)^{1/4} = \sqrt{p} + \sqrt{p}\,O(\hbar^{1/2}) \tag{15.44}$$

and

$$\frac{2}{3}(-u)^{3/2} = \frac{1}{\hbar}\int_a^x p(y)\,dy + O(\hbar^{1/4}). \tag{15.45}$$

Since the derivative of $\sin\theta$ is bounded, a change of order
$\hbar^{1/4}$ in the argument of a sine function produces a change of order
$\hbar^{1/4}$ in the value of the sine. Thus, if we substitute (15.44) and
(15.45) into (15.43), we find that the difference between the dominant term in
$\tilde{\psi}_1'$ and the dominant term in $\psi_2'$ is

$$\frac{1}{\hbar}\,\sqrt{p}\,O(\hbar^{1/4}) + \text{lower-order terms}.$$

Since $\sqrt{p}$ is of order $(x - a)^{1/4}$ or $\hbar^{1/8}$, we get an error of
order $\hbar^{-5/8}$, as claimed.

Finally, the subdominant terms in the derivatives of $\tilde{\psi}_1$ and
$\psi_2$ are easily seen to be separately of order $\hbar^{-5/8}$. Thus, even
without taking into account the cancellation between these terms, they do not
change the order of the estimate. ■


15.6.5 Proof of the Main Theorem 

We have estimated the contributions to $\|\hat{H}\psi - E\psi\|$ from each
type of region: classically allowed and classically forbidden regions, the
regions around the turning points, and the transition regions. In each case,
we have found a contribution that is of order at most
$\hbar^{9/8}\|\psi\|$. Thus, it remains only to verify that the constants in
all estimates are bounded uniformly over the given range
$E_1 \le E \le E_2$ of energies.

This verification is straightforward. Near the turning point $x = a$, for
example, we need to estimate the difference between the potential $V(x)$
and its linear approximation $\tilde{V}(x)$ near $x = a$. As a consequence of
the Taylor remainder formula, $|V(x) - \tilde{V}(x)|$ will be bounded by
$C|x - a|^2/2$, where $C$ is the maximum of $|V''(x)|$ over the interval from
$a$ to $x$. As $E$ varies over $[E_1, E_2]$, the set of points where we have to
evaluate $|V''(x)|$ will be bounded, meaning that $C$ can be taken to be
independent of $E$, for $E$ in such a range.

Similarly, in the classically allowed region, the blow-up of
$1/(V(x) - E)^2$ near $x = a(E)$ can be controlled by the minimum of
$|V'(y)|$ for $y$ between $a$ and $x$. By assumption, $|V'(x)| > 0$ at all the
turning points $a(E)$ and $b(E)$ with $E_1 \le E \le E_2$, and thus, by
continuity, in some neighborhood of that set of turning points. Thus, the
blow-up of $1/(V(x) - E)^2$ will be controlled by the minimum of $|V'(x)|$ on
an interval of the form $[a(E_2) - \alpha, a(E_1) + \alpha]$ for some small
$\alpha > 0$. The remaining details of this verification are left to the
reader.


15.7 Other Approaches 

The main complicating factor in the WKB approximation is the singular
behavior near the turning points. The turning points, meanwhile, are only
problematic because we are working in the position representation. The
turning points, after all, are the points on the classical trajectory where
the position of the particle achieves a maximum or a minimum. If we were
to work in the momentum representation, the points where the momentum
achieves a maximum or a minimum would instead be the problematic
points. A. Voros [42] has proposed working in the Segal-Bargmann
representation (Sect. 14.4). In Voros’s analysis, there are no turning points
and, thus, the analysis is much simpler. The problem with Voros’s approach is
that he only gives an approximation to the wave function on the classical
energy curve. Even in simple cases, Voros’s expression does not admit a
holomorphic extension to the whole plane, but has branching behavior inside
the classical energy curve. Thus, Voros’s formula does not define an
element of the quantum Hilbert space (which is a space of entire holomorphic
functions), let alone an element of the domain of the Hamiltonian.

Nevertheless, it is possible to build approximate eigenfunctions as
superpositions of coherent states, using formulas similar to those in Voros.
This approach avoids dealing with turning points but still yields a rigorous
eigenvalue estimate, with the same corrected Bohr-Sommerfeld condition
as in Condition 15.1. See [31, 23, 7], or (in greater generality) [26].


15.8 Exercises 

1. Show that if $c_1$ is any complex number, then we have an identity of
the form

$$c_1 e^{i\theta} + \overline{c_1}\,e^{-i\theta} = R\cos(\theta - \delta)$$

for some real numbers $R$ and $\delta$.

2. Let $H(x, p) = p^2/2m + m\omega^2x^2/2$ be the Hamiltonian for a harmonic
oscillator having mass $m$ and classical frequency $\omega$. Show that a
positive number $E$ satisfies the corrected Bohr-Sommerfeld condition
(Condition 15.1) if and only if $E$ is of the form $(n + 1/2)\hbar\omega$,
where $n$ is a non-negative integer.

Note: In light of the results of Chap. 11, this calculation means that,
in this very special case, the corrected Bohr-Sommerfeld condition
gives the exact eigenvalues of the quantum Hamiltonian $\hat{H}$.

3. Suppose $A$ and $p$ are two nonzero, smooth functions satisfying (15.15).
Show that $A(x) = C(p(x))^{-1/2}$ for some constant $C$.

Hint: Think in terms of the logarithms of the functions involved.

4. Show that $\cos(\theta - \delta)$, viewed as a function of $\theta$, agrees,
up to multiplication by a constant, with $\cos(\theta - \delta')$ if and only
if $\delta - \delta'$ is an integer multiple of $\pi$.

5. If $\psi$ is an eigenvector for $\hat{H}$ that is approximated by (15.25)
near $-\infty$, one might hope to find an approximate expression for $\psi$ in
the classically allowed region by analytically continuing around the
turning point in the complex plane. Even assuming $V$ is analytic,
however, it is fairly evident that analytic continuation in the upper
half-plane does not give the same answer as in the lower half-plane.
Nevertheless, one could use the average of the upper and lower half-plane
results as a (totally nonrigorous) guess for the behavior of $\psi$ in
the classically allowed region.

Show that the above approach gives the correct phase $\delta$ in the
connection formula (15.21) but is off by a factor of 2 in the amplitude $R$.

6. Using integration by parts, show that the limit

$$\lim_{A \to +\infty}\int_0^A \cos\left(\frac{t^3}{3} + ut\right)dt$$

exists.

Hint: Multiply and divide by $t^2 + u$ (avoiding points where $t^2 + u = 0$
in the case $u < 0$).

7. In this exercise, we sketch an argument that the Airy function in
(15.24) satisfies the differential equation $\psi''(u) - u\psi(u) = 0$. For
the purposes of this exercise, let us say that $\int_0^\infty f(t)\,dt = C$ if
$\int_0^A f(t)\,dt = C + g(A)$, where the function $g$ is bounded and oscillates
around an average value of zero.

Assuming that it is legal to differentiate under the integral sign, verify
that $\mathrm{Ai}(u)$ satisfies the stated equation.

Hint: After differentiating under the integral, look for a term that
can be integrated explicitly.

Note: A more rigorous approach to this verification would be to integrate
by parts as in Exercise 6 and then differentiate under the
integral. This approach is, however, a bit messier.

8. By integrating by parts repeatedly in (15.24), show that $\mathrm{Ai}(u)$
decays faster than any power of $u$ as $u$ tends to $+\infty$.

Hint: A key point is to show that the boundary terms in the integration
by parts vanish at every stage. After performing the integrations
by parts, estimate the resulting integral by using the inequality

$$\frac{1}{(t^2 + u)^n} \le \frac{1}{(t^2 + 1)^k}\,\frac{1}{u^{n-k}}, \qquad u > 1,$$
for some appropriate choice of $k$.

9. (a) For $u < 0$, make the change-of-variable $\tau = t/\sqrt{-u}$ in the
integral formula for the Airy function, to obtain the expression

$$\mathrm{Ai}(u) = \frac{\sqrt{-u}}{\pi}\int_0^\infty \cos\left(\alpha\left(\frac{\tau^3}{3} - \tau\right)\right)d\tau, \tag{15.46}$$

where $\alpha = (-u)^{3/2}$.

(b) Suppose $f$ is a smooth function on $[a, b]$ having a unique critical
point $x_0$. Assuming that $x_0$ is in the interior of $[a, b]$ and that
$f''(x_0) \ne 0$, the method of stationary phase asserts that

$$\int_a^b g(x)\,e^{i\alpha f(x)}\,dx = g(x_0)\,e^{i\alpha f(x_0)}\,e^{\pm i\pi/4}\sqrt{\frac{2\pi}{\alpha\,|f''(x_0)|}} + O\left(\frac{1}{\alpha}\right)$$

for $\alpha$ tending to $+\infty$, where the plus sign in the exponent is taken
when $f''(x_0) > 0$ and the minus sign is taken when $f''(x_0) < 0$.
(See, e.g., Eq. (5.12) in [30].)

Using this result, obtain the asymptotic formula (15.33).

Hint: Divide the integral in (15.46) into an integral over $[0, 2]$ and an
integral over $[2, \infty)$. Use stationary phase for the first interval and
integration by parts (as in Exercise 6) for the second interval.

10. Let $\psi$ be the approximate eigenfunction for $\hat{H}$ defined in the
beginning of Sect. 15.6. Show that the norm of $\psi$ is bounded and bounded
away from zero as $\hbar$ tends to zero.

Hint: First show that the $L^2$ norm of $\psi$ over the intervals around
the turning points goes like $\hbar^{-1/6}\hbar^{1/4}$. Then check that the
functions $p(x)^{-1/2}$ and $q(x)^{-1/2}$ are square integrable near the turning
points.


11. By imitating the arguments in the proof of Lemma 15.9, prove the
estimates for $\psi_1 - \tilde{\psi}_1$ and $\tilde{\psi}_1 - \psi_2$ in the
lemma.

12. By writing $V(x) - E$ as $F_0(a - x)$ plus an error term of order
$(x - a)^2$, verify that the estimates (15.44) and (15.45) in the proof of
Lemma 15.9 hold in the transition region. (Assume that $x - a$ is of order
$\hbar^{1/2}$ in the transition region.)

Hint: The leading-order Taylor expansion of $(1 + z)^a$ is
$1 + az + O(z^2)$, for any real number $a$.





16 

Lie Groups, Lie Algebras, and 
Representations 


An important concept in physics is that of symmetry, whether it be
rotational symmetry for many physical systems or Lorentz symmetry in
relativistic systems. In many cases, the group of symmetries of a system is
a continuous group, that is, a group that is parameterized by one or more
real parameters. More precisely, the symmetry group is often a Lie group,
that is, a smooth manifold endowed with a group structure in such a way
that operations of inversion and group multiplication are smooth. The
tangent space at the identity in a Lie group has a natural “bracket” operation
that makes the tangent space into a Lie algebra. The Lie algebra of a Lie
group encodes many of the properties of the Lie group, and yet the Lie
algebra is easier to work with because it is a linear space.

In quantum mechanics, the way symmetry is encoded is usually through
a unitary action of the group on the relevant Hilbert space. That is, we
assume we are given a unitary representation of the relevant symmetry
group $G$, that is, a continuous homomorphism of $G$ into $U(\mathbf{H})$, the
group of unitary operators on the quantum Hilbert space $\mathbf{H}$. Actually,
since two unit vectors in $\mathbf{H}$ that differ only by a constant represent
the same physical state, we should more properly consider projective unitary
representations. A projective representation is a homomorphism of a group $G$
into $U(\mathbf{H})/U(1)$, where $U(1)$ is the group of complex numbers of
magnitude 1, thought of as multiples of $I$ in $U(\mathbf{H})$. An ordinary or
projective representation of a Lie group gives rise to an ordinary or
projective representation of its Lie algebra. The angular momentum operators,
for example, form a representation of the Lie algebra of the rotation group.

Saying that, for example, the Hamiltonian operator of a quantum system
is invariant under rotations means that $\hat{H}$ commutes with the relevant
representation of the rotation group and thus also with the associated Lie
algebra operators. This commutativity, in turn, implies that the eigenspaces
for $\hat{H}$ are invariant under rotations. We will use this commutativity in
Chap. 18 to help us in determining the energy eigenvectors for the hydrogen
atom.

In this chapter, we will make a brief survey of Lie groups, Lie algebras, 
and their representations. For our purposes, it suffices to consider matrix 
Lie groups, those that can be realized as closed subgroups of the group of
$n \times n$ invertible matrices. Inevitably, I have had to present some of the
deeper results without proof. Proofs of all results stated here can be found 
in [21]. The results of this chapter will be put to use in Chap. 17, in our 
study of angular momentum, and in Chap. 18, in our study of the hydrogen 
atom. 


16.1 Summary 

In this chapter, we will consider a matrix Lie group $G$, which is, by
definition, a (topologically) closed subgroup of some
$\mathrm{GL}(n;\mathbb{C})$, where $\mathrm{GL}(n;\mathbb{C})$ is the group
of $n \times n$ invertible matrices with complex entries. To each such $G$,
we will associate the Lie algebra $\mathfrak{g}$ of $G$, where
$\mathfrak{g}$ is a real subspace of $M_n(\mathbb{C})$, the space of all
$n \times n$ matrices. We will see that $G$ is automatically an embedded
real submanifold of $M_n(\mathbb{C})$ and that $\mathfrak{g}$ is the tangent
space of $G$ at the identity matrix.

Now, $\mathfrak{g}$ is not just a real vector space, but comes with a
“bracket” operation mapping $\mathfrak{g} \times \mathfrak{g}$ into
$\mathfrak{g}$. Specifically, we will show that for all $X$ and $Y$ in
$\mathfrak{g}$, the matrix $XY - YX$ belongs again to $\mathfrak{g}$. Thus,
we define our bracket by setting $[X, Y]$ equal to $XY - YX$. As it turns
out, the Lie algebra $\mathfrak{g}$, as a vector space with the bracket
operation, encodes a lot of information about the group $G$. On the other
hand, computing at the level of the Lie algebra is generally easier than
computing at the group level, simply because $\mathfrak{g}$ is a linear
space.

We will be interested in unitary representations of our group $G$, that is,
continuous homomorphisms of $G$ into $U(\mathbf{H})$, the group of unitary
operators on a Hilbert space $\mathbf{H}$. If we restrict attention, at
first, to the case in which $\mathbf{H}$ is finite dimensional, then each
representation $\Pi$ of $G$ gives rise to a representation $\pi$ of the Lie
algebra $\mathfrak{g}$ of $G$. That is to say, $\pi$ is a linear map of
$\mathfrak{g}$ into the space of linear maps of $V$ to $V$, satisfying
$\pi([X, Y]) = [\pi(X), \pi(Y)]$. A deeper question is whether every
representation $\pi$ of $\mathfrak{g}$ comes from a representation $\Pi$ of
$G$. As it turns out, the answer in general is no, but the answer is yes if
$G$ is simply connected.

We may consider, for example, the case $G = \mathrm{SO}(3)$. This group is
not simply connected. On the other hand, the Lie algebra $\mathfrak{so}(3)$
of SO(3) is isomorphic to the Lie algebra $\mathfrak{su}(2)$ of SU(2), and
SU(2) is simply connected. [That is, SU(2) is the “universal cover” of
SO(3).] Thus, given a representation $\pi$ of $\mathfrak{so}(3)$, there may
or may not be an associated representation $\Pi$ of SO(3). Even if there is
not, however, there is always a representation $\Pi$ of the group SU(2).

In quantum mechanics, the vector $e^{i\theta}\psi$ represents the same
physical state as $\psi$. Thus, it is natural to consider “projective”
unitary representations, that is, homomorphisms of $G$ into the quotient
group $U(\mathbf{H})/\{e^{i\theta} I\}$. In the finite-dimensional case,
each projective representation can be “de-projectivized” at the level of
the Lie algebra $\mathfrak{g}$ of $G$. We can then pass from the Lie algebra
to the universal cover of $G$, that is, the simply connected group with Lie
algebra $\mathfrak{g}$. In particular, in the finite-dimensional case, the
irreducible projective unitary representations of SO(3) are in one-to-one
correspondence with irreducible ordinary unitary representations of the
universal cover SU(2) of SO(3). Although the Hilbert spaces of physical
systems are usually infinite dimensional, for compact groups such as SO(3),
general unitary representations can be decomposed as direct sums of
finite-dimensional ones. (See, e.g., Proposition 17.19 and the discussion
following it.)

16.2 Matrix Lie Groups 

Let $M_n(\mathbb{C})$ denote the space of $n \times n$ matrices with complex
entries. We identify $M_n(\mathbb{C})$ with $\mathbb{C}^{n^2}$, equipped with
the usual topology. Thus, a sequence $A_m$ in $M_n(\mathbb{C})$ converges to
a matrix $A \in M_n(\mathbb{C})$ if $(A_m)_{jk}$ converges to $A_{jk}$ as
$m$ tends to infinity, for all $1 \le j, k \le n$. Let
$\mathrm{GL}(n;\mathbb{C})$ denote the general linear group, consisting of
all invertible $n \times n$ matrices with complex entries. Then
$\mathrm{GL}(n;\mathbb{C})$ forms a group under the operation of matrix
multiplication. Furthermore, $\mathrm{GL}(n;\mathbb{C})$, that is, the set
of $A \in M_n(\mathbb{C})$ with $\det A \ne 0$, is an open subset of
$M_n(\mathbb{C})$. Since $M_n(\mathbb{C})$ is a complex vector space of
dimension $n^2$, it may be identified with
$\mathbb{C}^{n^2} \cong \mathbb{R}^{2n^2}$. Since
$\mathrm{GL}(n;\mathbb{C})$ is an open subset of $M_n(\mathbb{C})$, it looks
locally like $\mathbb{R}^{2n^2}$ and is therefore a real manifold of
dimension $2n^2$.

Definition 16.1 A subgroup $G$ of $\mathrm{GL}(n;\mathbb{C})$ is closed if
for each sequence $A_m$ in $G$ that converges to a matrix $A$, either $A$ is
again in $G$ or $A$ is not invertible. A matrix Lie group is a closed
subgroup of some $\mathrm{GL}(n;\mathbb{C})$.

In other words, a subgroup $G$ of $\mathrm{GL}(n;\mathbb{C})$ is closed in
this sense if it is topologically closed as a subset of
$\mathrm{GL}(n;\mathbb{C})$, though not necessarily as a subset of
$M_n(\mathbb{C})$. We will see that each matrix Lie group is a real embedded
submanifold of $\mathrm{GL}(n;\mathbb{C})$ and thus is a Lie group.

Definition 16.2 If $G_1$ and $G_2$ are matrix Lie groups, then a Lie group
homomorphism of $G_1$ to $G_2$ is a continuous group homomorphism of $G_1$
into $G_2$. A Lie group homomorphism is called a Lie group isomorphism
if it is one-to-one and onto with continuous inverse. Two matrix Lie groups
are called isomorphic if there exists a Lie group isomorphism between
them.

Example 16.3 The real general linear group, denoted
$\mathrm{GL}(n;\mathbb{R})$, is the group of invertible $n \times n$
matrices with real entries. The groups $\mathrm{SL}(n;\mathbb{C})$ and
$\mathrm{SL}(n;\mathbb{R})$ are, respectively, the groups of complex and
real $n \times n$ matrices with determinant 1. They are called the special
linear groups.

Example 16.4 An $n \times n$ matrix $U \in M_n(\mathbb{C})$ is said to be
unitary if $U^*U = UU^* = I$. A matrix $U$ is unitary if and only if

$$\langle Uv, Uw \rangle = \langle v, w \rangle$$

for all $v, w \in \mathbb{C}^n$. The group of unitary matrices is denoted
U(n) and called the ($n \times n$) unitary group. The special unitary
group, denoted SU(n), is the subgroup of U(n) consisting of unitary
matrices with determinant 1.

The condition $(U^*U)_{jk} = \delta_{jk}$ is equivalent to the condition
that the columns of $U$ form an orthonormal set in $\mathbb{C}^n$, as can be
seen by direct computation. Geometrically, the condition $U^*U = I$ is
equivalent to the condition that
$\langle Uv_1, Uv_2 \rangle = \langle v_1, v_2 \rangle$ for all
$v_1, v_2 \in \mathbb{C}^n$, i.e., that $U$ preserves the inner product on
$\mathbb{C}^n$. By taking the determinant of the condition $U^*U = I$, we
see that $|\det U| = 1$ for all $U \in \mathrm{U}(n)$.

In this, the finite-dimensional case, the condition $U^*U = I$ implies that
$U^*$ is the inverse of $U$ and thus that $UU^* = I$. This result does not
hold in the infinite-dimensional case.
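
The equivalent descriptions of unitarity can be checked numerically on a
small example. The following is only a sketch: the matrix U and the test
vectors are illustrative choices that do not come from the text, and the
inner product is assumed to be conjugate-linear in its first argument.

```python
import numpy as np

# A sample 2 x 2 unitary matrix (illustrative choice, not from the text).
U = np.array([[1, 1], [1j, -1j]]) / np.sqrt(2)

def is_unitary(M, tol=1e-12):
    """Check both U*U = I and UU* = I numerically."""
    identity = np.eye(M.shape[0])
    return (np.allclose(M.conj().T @ M, identity, atol=tol)
            and np.allclose(M @ M.conj().T, identity, atol=tol))

def inner(v, w):
    """Inner product on C^n, conjugate-linear in the first slot (assumed convention)."""
    return np.vdot(v, w)

# Two arbitrary test vectors in C^2.
v = np.array([1 + 2j, -1j])
w = np.array([0.5, 3 - 1j])

unitary_ok = is_unitary(U)
inner_preserved = np.isclose(inner(U @ v, U @ w), inner(v, w))
abs_det = abs(np.linalg.det(U))   # should equal 1 for any unitary matrix
```

A single matrix proves nothing in general, of course; the point is only to
see the three conditions side by side.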

Example 16.5 An $n \times n$ real matrix $R \in M_n(\mathbb{R})$ is said to
be orthogonal if $R^{\mathrm{tr}} R = R R^{\mathrm{tr}} = I$. A matrix $R$
is orthogonal if and only if

$$\langle Rv, Rw \rangle = \langle v, w \rangle$$

for all $v, w \in \mathbb{R}^n$. The group of orthogonal matrices is denoted
O(n) and is called the ($n \times n$) orthogonal group. The special
orthogonal group, denoted SO(n), is the subgroup of O(n) consisting of
orthogonal matrices with determinant 1.

As in the unitary case, the condition $R^{\mathrm{tr}} R = I$ implies that
$R R^{\mathrm{tr}} = I$ and that the columns of $R$ form an orthonormal set
in $\mathbb{R}^n$. Geometrically, a real matrix $R$ is in O(n) if and only
if $\langle Rv_1, Rv_2 \rangle = \langle v_1, v_2 \rangle$ for all
$v_1, v_2 \in \mathbb{R}^n$, i.e., if and only if $R$ preserves the inner
product on $\mathbb{R}^n$. By taking the determinant of the condition
$R^{\mathrm{tr}} R = I$ we see that $\det R = \pm 1$ for all
$R \in \mathrm{O}(n)$.

It is easy to verify that all the groups in Examples 16.3, 16.4, and 16.5
are, indeed, subgroups of $\mathrm{GL}(n;\mathbb{C})$ and that they are
closed.

Definition 16.6 A matrix Lie group $G$ is connected if for all $A, B \in G$
there is a continuous path $A(\cdot) : [0,1] \to M_n(\mathbb{C})$ such that
$A(0) = A$ and $A(1) = B$ and such that $A(t)$ lies in $G$ for all $t$. A
matrix Lie group $G$ is simply connected if it is connected and every
continuous loop in $G$ can be shrunk continuously to a point in $G$. A
matrix Lie group $G$ is compact if it is compact as a subset of
$M_n(\mathbb{C}) \cong \mathbb{R}^{2n^2}$.

By the Heine-Borel theorem (e.g., Proposition 0.26 of [12]), a matrix
Lie group $G$ is compact if and only if it is a closed and bounded subset
of $M_n(\mathbb{C})$. The condition we are calling “connected” is, more
properly, the condition of being path connected. We will see, however, that
each matrix Lie group is an embedded real submanifold of $M_n(\mathbb{C})$
and is, therefore, locally path connected. For matrix Lie groups, then,
connectedness and path connectedness are equivalent.

To prove that a matrix Lie group $G$ is connected, it suffices to prove that
for all $A \in G$, there is a continuous path in $G$ connecting $A$ to $I$.
After all, if both $A$ and $B$ can be connected to $I$, then they can be
connected to each other.


Example 16.7 The groups O(n), SO(n), U(n), and SU(n) are compact.


Proof. The conditions defining these groups are obtained by setting certain
continuous functions equal to a constant. The group SU(n), for example, is
defined by setting $(U^*U)_{jk} = \delta_{jk}$ for each $j$ and $k$ and by
setting $\det U = 1$. These groups are thus closed not just as subsets of
$\mathrm{GL}(n;\mathbb{C})$ but also as subsets of $M_n(\mathbb{C})$.
Furthermore, each of these groups has the property that each column of any
matrix in the group is a unit vector. Thus, each group is a bounded subset
of $M_n(\mathbb{C})$. ■


Example 16.8 The group U(n) is connected. 


Proof. If $U \in M_n(\mathbb{C})$ is unitary, then $U$ has an orthonormal
basis of eigenvectors with eigenvalues of absolute value 1. Thus, there is
another unitary matrix $V$ (the change of basis matrix) such that

$$U = V \begin{pmatrix} e^{i\theta_1} & & & 0 \\ & e^{i\theta_2} & & \\ & & \ddots & \\ 0 & & & e^{i\theta_n} \end{pmatrix} V^{-1}$$

for some real numbers $\theta_1, \theta_2, \ldots, \theta_n$. Thus, we can
define a family $U(t)$ of unitary matrices by setting

$$U(t) = V \begin{pmatrix} e^{it\theta_1} & & & 0 \\ & e^{it\theta_2} & & \\ & & \ddots & \\ 0 & & & e^{it\theta_n} \end{pmatrix} V^{-1}.$$

Then $U(\cdot)$ is a continuous path lying in U(n) with $U(0) = I$ and
$U(1) = U$. ■
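
The path constructed in this proof is easy to sample numerically. The sketch
below does not diagonalize a given unitary matrix; instead it assumes the
data V and theta are already known (both are illustrative choices, not taken
from the text) and checks that the resulting path starts at I and stays in
U(2).

```python
import numpy as np

# Illustrative data (not from the text): a unitary V and real angles theta.
V = np.array([[1, 1], [1j, -1j]]) / np.sqrt(2)
theta = np.array([0.7, -2.1])

def path(t):
    """U(t) = V diag(e^{i t theta_j}) V^{-1}; here V^{-1} = V* since V is unitary."""
    return V @ np.diag(np.exp(1j * t * theta)) @ V.conj().T

U = path(1.0)          # the unitary matrix being connected to the identity
I2 = np.eye(2)

starts_at_identity = np.allclose(path(0.0), I2)
stays_unitary = all(
    np.allclose(path(t).conj().T @ path(t), I2) for t in np.linspace(0, 1, 11)
)
```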


Example 16.9 The group SU(2) is simply connected.

Proof. We claim that

$$\mathrm{SU}(2) = \left\{ \begin{pmatrix} \alpha & -\bar{\beta} \\ \beta & \bar{\alpha} \end{pmatrix} \,\middle|\, \alpha, \beta \in \mathbb{C},\ |\alpha|^2 + |\beta|^2 = 1 \right\}.$$

It is easy to see that each matrix of the indicated form is indeed unitary
and has determinant 1. On the other hand, if $U$ is any element of SU(2),
then the first column of $U$ is a unit vector
$(\alpha, \beta) \in \mathbb{C}^2$. The second column of $U$ must then be
orthogonal to $(\alpha, \beta)$. Since $(-\bar{\beta}, \bar{\alpha})$ is
orthogonal to $(\alpha, \beta)$ and $\mathbb{C}^2$ is 2-dimensional, the
second column of $U$ must be a multiple of $(-\bar{\beta}, \bar{\alpha})$.
But the only multiple that produces a matrix with determinant 1 is 1.

We see, then, that SU(2) is, topologically, the unit sphere $S^3$ inside
$\mathbb{C}^2 \cong \mathbb{R}^4$ and is, therefore, simply connected. ■
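
As a small sanity check of the parametrization in this proof, one can build
the matrix attached to a point of the unit sphere in $\mathbb{C}^2$ and
confirm that it lies in SU(2). The helper name `su2_element` and the sample
values of alpha and beta are hypothetical, chosen only for illustration.

```python
import numpy as np

def su2_element(alpha, beta):
    """Matrix with first column (alpha, beta) and second column (-conj(beta), conj(alpha))."""
    return np.array([[alpha, -np.conj(beta)],
                     [beta, np.conj(alpha)]], dtype=complex)

# A point on the unit sphere S^3 in C^2: |alpha|^2 + |beta|^2 = 0.36 + 0.64 = 1.
alpha = 0.6 * np.exp(0.3j)
beta = 0.8 * np.exp(-1.1j)
U = su2_element(alpha, beta)

is_unitary = np.allclose(U.conj().T @ U, np.eye(2))
det_U = np.linalg.det(U)   # equals |alpha|^2 + |beta|^2
```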


16.3 Lie Algebras 

We now introduce the general algebraic concept of a Lie algebra. Once this 
is done, we will show how to associate a real Lie algebra with an arbitrary 
matrix Lie group. 

Definition 16.10 A Lie algebra over a field $F$ is a vector space
$\mathfrak{g}$ over $F$, together with a “bracket” map
$[\cdot,\cdot] : \mathfrak{g} \times \mathfrak{g} \to \mathfrak{g}$ having
the following properties:

1. $[\cdot,\cdot]$ is bilinear

2. $[Y, X] = -[X, Y]$ for all $X, Y \in \mathfrak{g}$

3. $[X, X] = 0$ for all $X \in \mathfrak{g}$

4. For all $X, Y, Z \in \mathfrak{g}$ we have the Jacobi identity

$$[X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0.$$

If the characteristic of $F$ is not equal to 2, then Property 3 is a
consequence of Property 2. If $F = \mathbb{R}$, then we say that
$\mathfrak{g}$ is a real Lie algebra. An example of a real Lie algebra is
the vector space $\mathbb{R}^3$ with the bracket equal to the cross product.
Properties 1, 2, and 3 are evident from the definition of the cross product,
while the Jacobi identity is a known property of the cross product that can
be verified by direct calculation.
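
A numerical spot check of skew-symmetry and the Jacobi identity for the cross
product takes only a few lines. This is a sketch on three pseudo-random
sample vectors (the seed is an arbitrary choice); it illustrates the
identities but is no substitute for the direct calculation.

```python
import numpy as np

rng = np.random.default_rng(0)
X, Y, Z = rng.standard_normal((3, 3))   # three sample vectors in R^3

bracket = np.cross                       # the bracket on R^3 is the cross product

skew = bracket(Y, X) + bracket(X, Y)     # Property 2: should be the zero vector
jacobi = (bracket(X, bracket(Y, Z))
          + bracket(Y, bracket(Z, X))
          + bracket(Z, bracket(X, Y)))   # Property 4: should be the zero vector
```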

A large class of Lie algebras may be obtained by the following procedure.

Example 16.11 Let $\mathcal{A}$ be an associative algebra and let
$\mathfrak{g}$ be a subspace of $\mathcal{A}$ with the property that for all
$x, y$ in $\mathfrak{g}$, $xy - yx$ is again in $\mathfrak{g}$. Then the
bracket

$$[x, y] := xy - yx$$

makes $\mathfrak{g}$ into a Lie algebra.

In Example 16.11, we may take, for example, $\mathfrak{g} = \mathcal{A}$. It
is evident that this bracket satisfies Properties 1, 2, and 3 of a Lie
algebra, and the Jacobi identity is easily verified by direct calculation.
As it turns out, every Lie algebra is isomorphic to a Lie algebra of this
type. (This claim is a consequence of the Poincaré-Birkhoff-Witt theorem,
which is proved, for example, in Sect. 5.2 of [25]. The algebra
$\mathcal{A}$ in the Poincaré-Birkhoff-Witt theorem is the so-called
universal enveloping algebra of $\mathfrak{g}$.)
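
The direct calculation behind the Jacobi identity for the commutator bracket
can be written out as follows. This is only a sketch of the routine
expansion; it uses associativity of the product in the ambient algebra and
nothing else.

```latex
\begin{aligned}
[x,[y,z]] &= x(yz - zy) - (yz - zy)x = xyz - xzy - yzx + zyx, \\
[y,[z,x]] &= y(zx - xz) - (zx - xz)y = yzx - yxz - zxy + xzy, \\
[z,[x,y]] &= z(xy - yx) - (xy - yx)z = zxy - zyx - xyz + yxz.
\end{aligned}
```

Each of the six monomials $xyz$, $xzy$, $yzx$, $zyx$, $yxz$, $zxy$ appears
once with a plus sign and once with a minus sign, so the three lines sum to
zero.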

Definition 16.12 If $\mathfrak{g}_1$ and $\mathfrak{g}_2$ are Lie algebras,
a map $\phi : \mathfrak{g}_1 \to \mathfrak{g}_2$ is called a Lie algebra
homomorphism if $\phi$ is linear and $\phi$ satisfies

$$\phi([X, Y]) = [\phi(X), \phi(Y)]$$

for all $X, Y \in \mathfrak{g}_1$. A Lie algebra homomorphism is called a
Lie algebra isomorphism if it is one-to-one and onto.

Definition 16.13 If $\mathfrak{g}$ is a Lie algebra, a subalgebra of
$\mathfrak{g}$ is a subspace $\mathfrak{h}$ of $\mathfrak{g}$ with the
property that $[X, Y] \in \mathfrak{h}$ for all $X$ and $Y$ in
$\mathfrak{h}$. An ideal in $\mathfrak{g}$ is a subalgebra $\mathfrak{h}$ of
$\mathfrak{g}$ with the stronger property that $[X, Y] \in \mathfrak{h}$ for
all $X$ in $\mathfrak{g}$ and $Y$ in $\mathfrak{h}$.

The notion of a subalgebra of a Lie algebra is analogous to the notion 
of a subgroup of a group, while the notion of an ideal in a Lie algebra is 
analogous to the notion of a normal subgroup of a group. In particular, 
the kernel of any Lie algebra homomorphism is an ideal, just as the kernel 
of a group homomorphism is a normal subgroup. 

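Definition 16.14 The direct sum of Lie algebras $\mathfrak{g}_1$ and
$\mathfrak{g}_2$, denoted $\mathfrak{g}_1 \oplus \mathfrak{g}_2$, is the
direct sum of $\mathfrak{g}_1$ and $\mathfrak{g}_2$ as a vector space,
equipped with the bracket given by

$$[(X_1, Y_1), (X_2, Y_2)] = ([X_1, X_2], [Y_1, Y_2])$$

for all $X_1, X_2 \in \mathfrak{g}_1$ and $Y_1, Y_2 \in \mathfrak{g}_2$.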


16.4 The Matrix Exponential 

In the next section, we will associate a Lie algebra with each matrix Lie
group. To describe this association, we need the notion of the exponential
of a matrix. Given a matrix $X \in M_n(\mathbb{C})$, we define the matrix
exponential of $X$, denoted by $e^X$ or $\exp(X)$, by the usual power
series,

$$e^X = \sum_{m=0}^{\infty} \frac{X^m}{m!},$$

where $X^0 = I$ (the identity matrix). This series converges absolutely for
all $X \in M_n(\mathbb{C})$, as can easily be seen using the inequality
$\|X^m\| \le \|X\|^m$, where $\|X\|$ is the operator norm of $X$; see
Definition A.35. (In this, the finite-dimensional case, we could just as
well use the Hilbert-Schmidt norm, which amounts to using the usual
Euclidean norm on $M_n(\mathbb{C}) \cong \mathbb{C}^{n^2}$. See Exercise 3.)
The matrix exponential shares some but not all of the properties of the
exponential of a number.

Theorem 16.15 The matrix exponential has the following properties for
all $X, Y \in M_n(\mathbb{C})$.

1. $e^0 = I$

2. $e^{X^{\mathrm{tr}}} = (e^X)^{\mathrm{tr}}$ and $e^{X^*} = (e^X)^*$

3. If $A$ is an invertible $n \times n$ matrix, then

$$e^{AXA^{-1}} = A e^X A^{-1}.$$

4. $\det(e^X) = e^{\mathrm{trace}(X)}$

5. If $XY = YX$ then $e^{X+Y} = e^X e^Y$

6. $e^X$ is invertible and $(e^X)^{-1} = e^{-X}$

7. Even if $XY \ne YX$, we have

$$e^{X+Y} = \lim_{m \to \infty} \left( e^{X/m} e^{Y/m} \right)^m.$$

Here $X^{\mathrm{tr}}$ and $X^*$ denote the transpose and adjoint (conjugate
transpose) of $X$, respectively. Property 7 is known as the Lie Product
Formula and is a special case of the Trotter Product Formula (Theorem 20.1).
Properties 1, 2, and 3 are easily verified using term-by-term computation.
Property 6 follows from Property 5 by taking $Y = -X$ and applying
Property 1. The proofs of Properties 4, 5, and 7 are outlined in
Exercises 5, 6, and 7.
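
Several of these properties can be observed numerically. In the sketch
below, `expm` is a hypothetical helper that simply truncates the defining
power series (adequate for small matrices of modest norm, and not meant as a
robust algorithm), and the matrices X and Y are arbitrary noncommuting
examples rather than anything taken from the text.

```python
import numpy as np

def expm(X, terms=60):
    """Truncated power series sum_{m < terms} X^m / m! (fine for small matrices)."""
    X = np.asarray(X, dtype=complex)
    result = np.eye(X.shape[0], dtype=complex)
    term = np.eye(X.shape[0], dtype=complex)
    for m in range(1, terms):
        term = term @ X / m          # term now holds X^m / m!
        result = result + term
    return result

# Two sample matrices that do not commute (illustrative values).
X = np.array([[0.2, 1.0], [0.0, -0.5j]])
Y = np.array([[0.0, 0.0], [1.0, 0.3]])

# Property 4: det(e^X) = e^{trace(X)}.
det_gap = abs(np.linalg.det(expm(X)) - np.exp(np.trace(X)))

# Property 6: e^X e^{-X} = I.
inverse_gap = np.linalg.norm(expm(X) @ expm(-X) - np.eye(2))

# Property 5 needs XY = YX: for this pair e^{X+Y} and e^X e^Y differ.
product_gap = np.linalg.norm(expm(X + Y) - expm(X) @ expm(Y))

# Property 7 (Lie product formula): (e^{X/m} e^{Y/m})^m approaches e^{X+Y}.
def lie_product(X, Y, m):
    step = expm(X / m) @ expm(Y / m)
    return np.linalg.matrix_power(step, m)

target = expm(X + Y)
errors = [np.linalg.norm(lie_product(X, Y, m) - target) for m in (10, 100, 1000)]
```

The entries of `errors` should shrink roughly like 1/m, which is the slow
but steady convergence one expects from the Lie product formula.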

Suppose a matrix $X$ is diagonalizable, meaning that

$$X = A \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix} A^{-1}$$

for some invertible matrix $A$ and complex numbers
$\lambda_1, \lambda_2, \ldots, \lambda_n$. Then using Property 3 of
Theorem 16.15, it is easy to see that

$$e^X = A \begin{pmatrix} e^{\lambda_1} & & 0 \\ & \ddots & \\ 0 & & e^{\lambda_n} \end{pmatrix} A^{-1}.$$

If $X$ is not diagonalizable, $e^X$ can be computed in terms of the SN
decomposition of $X$. See Sect. 2.2 of [21] for details.


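Example 16.16 If

$$X = \begin{pmatrix} 0 & a \\ -a & 0 \end{pmatrix},$$

then

$$e^X = \begin{pmatrix} \cos a & \sin a \\ -\sin a & \cos a \end{pmatrix}.$$

Proof. The eigenvalues of $X$ are $\pm ia$ and the corresponding
eigenvectors are $(1, \pm i)$. Thus, we may calculate that

$$\begin{aligned}
e^X &= \begin{pmatrix} 1 & 1 \\ i & -i \end{pmatrix} \begin{pmatrix} e^{ia} & 0 \\ 0 & e^{-ia} \end{pmatrix} \frac{1}{(-2i)} \begin{pmatrix} -i & -1 \\ -i & 1 \end{pmatrix} \\
&= \frac{1}{-2i} \begin{pmatrix} e^{ia} & e^{-ia} \\ i e^{ia} & -i e^{-ia} \end{pmatrix} \begin{pmatrix} -i & -1 \\ -i & 1 \end{pmatrix},
\end{aligned}$$

which simplifies to the desired result. ■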
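
The computation in this proof can be replayed numerically. The sketch below
assumes the sign convention in which the eigenvector $(1, i)$ belongs to the
eigenvalue $ia$, which is the convention forced by the change-of-basis
matrix in the proof; `expm` is a hypothetical helper that truncates the
power series, and the angle a is an arbitrary sample value.

```python
import numpy as np

def expm(X, terms=60):
    """Truncated power series sum_{m < terms} X^m / m! (fine for small matrices)."""
    X = np.asarray(X, dtype=complex)
    result = np.eye(X.shape[0], dtype=complex)
    term = np.eye(X.shape[0], dtype=complex)
    for m in range(1, terms):
        term = term @ X / m
        result = result + term
    return result

a = 0.9                                   # sample angle
X = np.array([[0.0, a], [-a, 0.0]])
expected = np.array([[np.cos(a), np.sin(a)],
                     [-np.sin(a), np.cos(a)]])

# Diagonalization route from the proof: the columns of A are (1, i) and (1, -i).
A = np.array([[1, 1], [1j, -1j]])
D = np.diag([np.exp(1j * a), np.exp(-1j * a)])
via_eigen = A @ D @ np.linalg.inv(A)

# Independent route: sum the defining power series directly.
via_series = expm(X)
```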

The relation $e^{X+Y} = e^X e^Y$ certainly does not hold for general
(noncommuting) matrices $X$ and $Y$. Nevertheless, for any
$X \in M_n(\mathbb{C})$ we have

$$e^{(s+t)X} = e^{sX} e^{tX}$$

for all $s$ and $t$ in $\mathbb{R}$, since $sX$ commutes with $tX$. Thus,
for each $X$, the set of matrices of the form $e^{tX}$, $t \in \mathbb{R}$,
forms a subgroup of $\mathrm{GL}(n;\mathbb{C})$. It is not hard to show
(Exercise 4), using term-by-term differentiation, that

$$\left. \frac{d}{dt} e^{tX} \right|_{t=0} = X. \tag{16.1}$$

Here, the derivative of a matrix-valued function is defined as being
entrywise. [That is, if $f(t)$ is a matrix-valued function, $df/dt$ is the
matrix-valued function whose $(j,k)$ entry is $d(f(t)_{jk})/dt$.]

Definition 16.17 A one-parameter subgroup of $\mathrm{GL}(n;\mathbb{C})$ is
a continuous homomorphism of $\mathbb{R}$ into $\mathrm{GL}(n;\mathbb{C})$,
that is, a continuous map $A : \mathbb{R} \to \mathrm{GL}(n;\mathbb{C})$
such that $A(0) = I$ and $A(s + t) = A(s)A(t)$ for all
$s, t \in \mathbb{R}$.

Theorem 16.18 If $A(\cdot)$ is a one-parameter subgroup of
$\mathrm{GL}(n;\mathbb{C})$, there exists a unique $X \in M_n(\mathbb{C})$
such that

$$A(t) = e^{tX}$$

for all $t \in \mathbb{R}$.

This is Theorem 2.13 in [21].

16.5 The Lie Algebra of a Matrix Lie Group 

We now associate a Lie algebra g to each matrix Lie group G. 

Definition 16.19 If $G \subset \mathrm{GL}(n;\mathbb{C})$ is a matrix Lie
group, then the Lie algebra $\mathfrak{g}$ of $G$ is defined as follows:

$$\mathfrak{g} = \left\{ X \in M_n(\mathbb{C}) \mid e^{tX} \in G \text{ for all } t \in \mathbb{R} \right\}.$$

That is to say, $X$ belongs to $\mathfrak{g}$ if and only if the
one-parameter subgroup generated by $X$ lies entirely in $G$. Note that to
have $X$ belong to $\mathfrak{g}$, we need only have $e^{tX}$ belong to $G$
for all real numbers $t$.

Proposition 16.20 For any matrix Lie group $G$, the Lie algebra
$\mathfrak{g}$ of $G$ has the following properties.

1. The zero matrix 0 belongs to $\mathfrak{g}$.

2. For all $X$ in $\mathfrak{g}$, $tX$ belongs to $\mathfrak{g}$ for all
real numbers $t$.

3. For all $X$ and $Y$ in $\mathfrak{g}$, $X + Y$ belongs to $\mathfrak{g}$.

4. For all $A \in G$ and $X \in \mathfrak{g}$ we have
$AXA^{-1} \in \mathfrak{g}$.

5. For all $X$ and $Y$ in $\mathfrak{g}$, the commutator
$[X, Y] := XY - YX$ belongs to $\mathfrak{g}$.

The first three properties of $\mathfrak{g}$ say that $\mathfrak{g}$ is a
real vector space. Since $M_n(\mathbb{C})$ is an associative algebra under
the operation of matrix multiplication, the last property of $\mathfrak{g}$
shows that $\mathfrak{g}$ is a real Lie algebra (Example 16.11).

Proof. Points 1 and 2 are elementary, and Point 3 follows from the Lie
product formula, using the assumption that $G$ is closed. Point 4 follows
from Property 3 in Theorem 16.15. To verify Point 5, we observe that the
commutator $[X, Y]$ may be computed as

$$[X, Y] = \left. \frac{d}{dt} e^{tX} Y e^{-tX} \right|_{t=0},$$

using (16.1) and an easily verified product rule for differentiation of
matrix-valued functions. For $X, Y \in \mathfrak{g}$, $e^{tX} Y e^{-tX}$
belongs to $\mathfrak{g}$ for all $t \in \mathbb{R}$, by Point 4.
Furthermore, we have already shown that $\mathfrak{g}$ is a real subspace of
$M_n(\mathbb{C})$ and therefore a closed subset of $M_n(\mathbb{C})$. Thus,

$$[X, Y] = \lim_{h \to 0} \frac{e^{hX} Y e^{-hX} - Y}{h}$$

belongs to $\mathfrak{g}$. ■
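
The derivative formula used for Point 5 can be checked in one line. The
sketch assumes only (16.1), applied to both $X$ and $-X$, together with the
product rule for matrix-valued functions mentioned in the proof.

```latex
\left. \frac{d}{dt}\, e^{tX} Y e^{-tX} \right|_{t=0}
  = \left. \left( X e^{tX} \right) Y e^{-tX} \right|_{t=0}
  + \left. e^{tX} Y \left( -X e^{-tX} \right) \right|_{t=0}
  = XY - YX .
```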

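Example 16.21 Let $\mathfrak{gl}(n;\mathbb{C})$,
$\mathfrak{gl}(n;\mathbb{R})$, $\mathfrak{sl}(n;\mathbb{C})$, and
$\mathfrak{sl}(n;\mathbb{R})$ denote the Lie algebras of
$\mathrm{GL}(n;\mathbb{C})$, $\mathrm{GL}(n;\mathbb{R})$,
$\mathrm{SL}(n;\mathbb{C})$, and $\mathrm{SL}(n;\mathbb{R})$, respectively.
Then we have

$$\begin{aligned}
\mathfrak{gl}(n;\mathbb{C}) &= M_n(\mathbb{C}) \\
\mathfrak{gl}(n;\mathbb{R}) &= M_n(\mathbb{R}) \\
\mathfrak{sl}(n;\mathbb{C}) &= \{ X \in M_n(\mathbb{C}) \mid \mathrm{trace}(X) = 0 \} \\
\mathfrak{sl}(n;\mathbb{R}) &= \{ X \in M_n(\mathbb{R}) \mid \mathrm{trace}(X) = 0 \}.
\end{aligned}$$

Proof. Let us consider, for example, the case of
$\mathfrak{sl}(n;\mathbb{C})$. By Property 4 of Theorem 16.15, if
$\mathrm{trace}(X) = 0$, then

$$\det(e^{tX}) = e^{t\,\mathrm{trace}(X)} = e^0 = 1,$$

so that $e^{tX} \in \mathrm{SL}(n;\mathbb{C})$. In the other direction, if
$X \in \mathfrak{sl}(n;\mathbb{C})$, then by the above calculation, we must
have $e^{t\,\mathrm{trace}(X)} = 1$ for all $t \in \mathbb{R}$, which is
possible only if $\mathrm{trace}(X) = 0$. The proofs of the other cases are
similar and are omitted. ■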

Example 16.22 The Lie algebras $\mathfrak{u}(n)$ and $\mathfrak{su}(n)$ of
U(n) and SU(n) are given by

$$\begin{aligned}
\mathfrak{u}(n) &= \{ X \in M_n(\mathbb{C}) \mid X^* = -X \} \\
\mathfrak{su}(n) &= \{ X \in \mathfrak{u}(n) \mid \mathrm{trace}(X) = 0 \}.
\end{aligned}$$

The Lie algebra $\mathfrak{so}(n)$ of SO(n) is given by

$$\mathfrak{so}(n) = \{ X \in M_n(\mathbb{R}) \mid X^{\mathrm{tr}} = -X \}.$$

Finally, the Lie algebra of O(n) is equal to $\mathfrak{so}(n)$.

Proof. If $X^* = -X$, then by Property 2 of Theorem 16.15,

$$(e^{tX})^* = e^{tX^*} = e^{-tX} = (e^{tX})^{-1},$$

showing that $e^{tX}$ is unitary. In the other direction, if $e^{tX}$ is
unitary for all $t \in \mathbb{R}$, then
$(e^{tX})^* = (e^{tX})^{-1} = e^{-tX}$. Thus, $e^{tX^*} = e^{-tX}$.
Differentiating this relation at $t = 0$, using (16.1), gives $X^* = -X$.
Thus, the Lie algebra of U(n) consists exactly of the matrices with the
property that $X^* = -X$. For the Lie algebra of SU(n), we add the
trace-zero condition, as in the proof of Example 16.21. The calculations for
SO(n) are similar and are omitted. Note that if $X \in M_n(\mathbb{R})$
satisfies $X^{\mathrm{tr}} = -X$, then the diagonal entries of $X$ are zero
and, thus, $\mathrm{trace}(X)$ is automatically 0. This observation explains
why the Lie algebras of O(n) and SO(n) are the same. ■
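
One direction of this example is easy to watch numerically: exponentials of
skew-adjoint matrices are unitary, and removing the trace forces determinant
1. The sketch below uses a hypothetical power-series helper `expm` and an
arbitrary sample matrix M; neither comes from the text.

```python
import numpy as np

def expm(X, terms=60):
    """Truncated power series sum_{m < terms} X^m / m! (fine for small matrices)."""
    X = np.asarray(X, dtype=complex)
    result = np.eye(X.shape[0], dtype=complex)
    term = np.eye(X.shape[0], dtype=complex)
    for m in range(1, terms):
        term = term @ X / m
        result = result + term
    return result

# Arbitrary sample matrix (illustrative values), then its skew-adjoint part.
M = np.array([[1 + 2j, 0.5, -1j],
              [2, -1 + 1j, 0.3],
              [0.7j, 1, 0.2 - 0.4j]])
X = (M - M.conj().T) / 2                   # X* = -X, so X lies in u(3)
X0 = X - (np.trace(X) / 3) * np.eye(3)     # trace removed, so X0 lies in su(3)

times = [-1.0, 0.5, 2.0]
unitary_ok = all(
    np.allclose(expm(t * X).conj().T @ expm(t * X), np.eye(3)) for t in times
)
dets = [np.linalg.det(expm(t * X0)) for t in times]   # each should be 1
```

Subtracting the trace keeps X0 skew-adjoint because the trace of a
skew-adjoint matrix is purely imaginary.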

Specializing Example 16.22 to the case $n = 3$ gives

$$\mathfrak{so}(3) = \left\{ \begin{pmatrix} 0 & a & b \\ -a & 0 & c \\ -b & -c & 0 \end{pmatrix} \,\middle|\, a, b, c \in \mathbb{R} \right\}.$$

We can use the following basis for $\mathfrak{so}(3)$:

$$F_1 := \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}; \quad
F_2 := \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}; \quad
F_3 := \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{16.2}$$

Direct calculation establishes the following commutation relations for the
$F_j$'s:

$$\begin{aligned}
[F_1, F_2] &= F_3 \\
[F_2, F_3] &= F_1 \\
[F_3, F_1] &= F_2.
\end{aligned} \tag{16.3}$$

More concisely, we have $[F_1, F_2] = F_3$, together with relations obtained
from this one by cyclic permutation of the indices. Note that all remaining
commutation relations follow from (16.3) by means of the skew-symmetry
of the bracket; we have, for example, $[F_2, F_1] = -F_3$ and
$[F_1, F_1] = 0$.
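
The "direct calculation" for (16.3) is a natural thing to delegate to a few
lines of code. The sketch below encodes the basis (16.2) and checks the
relations exactly with integer arithmetic; the helper name `bracket` is an
illustrative choice.

```python
import numpy as np

# The basis (16.2) of so(3).
F1 = np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]])
F2 = np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]])
F3 = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]])

def bracket(X, Y):
    """Matrix commutator [X, Y] = XY - YX."""
    return X @ Y - Y @ X

relations_hold = (np.array_equal(bracket(F1, F2), F3)
                  and np.array_equal(bracket(F2, F3), F1)
                  and np.array_equal(bracket(F3, F1), F2))

# Each basis element is skew-symmetric, as membership in so(3) requires.
skew_symmetric = all(np.array_equal(F.T, -F) for F in (F1, F2, F3))
```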


16.6 Relationships Between Lie Groups and Lie Algebras

In this section, we explore the relationships between matrix Lie groups and 
their Lie algebras. In particular, we investigate the question of the extent 
to which a matrix Lie group is determined (up to isomorphism) by its Lie 
algebra. We begin by showing that every Lie group homomorphism gives 
rise to a Lie algebra homomorphism in a natural way. 

Theorem 16.23 Suppose $G_1$ and $G_2$ are matrix Lie groups with Lie
algebras $\mathfrak{g}_1$ and $\mathfrak{g}_2$, respectively, and suppose
$\Phi : G_1 \to G_2$ is a Lie group homomorphism. Then there exists a unique
linear map $\phi : \mathfrak{g}_1 \to \mathfrak{g}_2$ such that

$$\Phi(e^{tX}) = e^{t\phi(X)}$$

for all $t \in \mathbb{R}$ and $X \in \mathfrak{g}_1$. This linear map has
the following additional properties:

1. $\phi([X, Y]) = [\phi(X), \phi(Y)]$ for all $X, Y \in \mathfrak{g}_1$

2. $\phi(AXA^{-1}) = \Phi(A)\phi(X)\Phi(A)^{-1}$ for all $A \in G_1$ and
$X \in \mathfrak{g}_1$

3. $\phi(X)$ may be computed as

$$\phi(X) = \left. \frac{d}{dt} \Phi(e^{tX}) \right|_{t=0}.$$

Point 1 shows that $\phi$ is a Lie algebra homomorphism. Part of the
assertion of Point 3 of the theorem is that $\Phi(e^{tX})$ is a smooth
function of $t$ for each $X$.

To construct $\phi$, note that since $\Phi$ is a continuous homomorphism,
the map $t \mapsto \Phi(e^{tX})$ is a one-parameter subgroup. By
Theorem 16.18, there exists a unique $Y$ such that
$\Phi(e^{tX}) = e^{tY}$ for all $t \in \mathbb{R}$. We then set
$\phi(X) = Y$. An argument similar to the proof of Proposition 16.20 then
establishes the desired properties of $\phi$. See the proof of Theorem 2.21
in [21] for the details.

Corollary 16.24 Suppose that $G_1$ and $G_2$ are matrix Lie groups with Lie
algebras $\mathfrak{g}_1$ and $\mathfrak{g}_2$, respectively. If $G_1$ is
isomorphic to $G_2$, then $\mathfrak{g}_1$ is isomorphic to
$\mathfrak{g}_2$.

Proof. See Exercise 11. ■

Our next task is to show that for any matrix Lie group $G$, the Lie algebra
$\mathfrak{g}$ of $G$ is large enough to capture what is happening in a
neighborhood of the identity in $G$. This will show, for example, that for
connected matrix Lie groups, a Lie group homomorphism is determined by the
corresponding Lie algebra homomorphism.

Theorem 16.25 Let $G$ be a matrix Lie group with Lie algebra $\mathfrak{g}$.
Then there exists a neighborhood $U$ of 0 in $M_n(\mathbb{C})$ and a
neighborhood $V$ of $I$ in $M_n(\mathbb{C})$ such that the matrix
exponential maps $U$ diffeomorphically onto $V$ and such that for all
$X \in U$, we have that $X$ belongs to $\mathfrak{g}$ if and only if $e^X$
belongs to $G$.

See Theorem 2.27 in [21]. This result has a number of important
consequences.

Corollary 16.26 Every matrix Lie group $G \subset \mathrm{GL}(n;\mathbb{C})$
is a real embedded submanifold of $M_n(\mathbb{C})$ with the dimension of
$G$ equal to the dimension of $\mathfrak{g}$ as a real vector space.

The claim means, more precisely, that for each $A \in G$, there exists a
neighborhood $U$ of $A$ and a diffeomorphism $\Phi$ of $U$ with a
neighborhood $V$ of 0 in $\mathbb{R}^{2n^2}$ such that
$\Phi(U \cap G) = V \cap \mathbb{R}^d$, where $d = \dim \mathfrak{g}$. That
is to say, after a change of coordinates, $G$ “looks” locally like a little
piece of $\mathbb{R}^d$ sitting inside
$M_n(\mathbb{C}) \cong \mathbb{R}^{2n^2}$.

Proof. We use exponential coordinates in the neighborhood $V$ of $I$ in
$M_n(\mathbb{C})$, meaning that we write each element $A$ of $V$ as
$A = e^X$, with $X \in U$. Theorem 16.25 says that near the identity, in
these coordinates, $G$ “looks like” the real vector space $\mathfrak{g}$
inside $M_n(\mathbb{C})$. Given any other point $A \in G$, we can use left
multiplication by $A^{-1}$ to move the action to the identity (Exercise 17),
with the result that $G$ looks like
$\mathfrak{g} \subset M_n(\mathbb{C})$ near $A$. Thus, $G$ is a real
embedded submanifold of dimension $d = \dim \mathfrak{g}$. ■

Corollary 16.27 The Lie algebra $\mathfrak{g}$ of a matrix Lie group $G$ is
the tangent space to $G$ at $I$. That is to say, $\mathfrak{g}$ coincides
with the set of those $X$ in $M_n(\mathbb{C})$ for which there exists a
smooth curve $\gamma : \mathbb{R} \to M_n(\mathbb{C})$ lying entirely in $G$
and such that $\gamma(0) = I$ and $\gamma'(0) = X$.

Proof. If $X \in \mathfrak{g}$, then $X$ is the derivative of $e^{tX}$ at
$t = 0$, so $\mathfrak{g}$ is contained in the tangent space at $I$. In the
other direction, if $\gamma$ is any smooth curve in $M_n(\mathbb{C})$ that
lies entirely in $G$ and passes through $I$ at $t = 0$, then by
Theorem 16.25, we can express $\gamma$ as $\gamma(t) = e^{\delta(t)}$ (at
least for small $t$), where $\delta$ is a smooth curve in $\mathfrak{g}$
with $\delta(0) = 0$. It is then easy to see (Exercise 8) that
$\gamma'(0) = \delta'(0)$. But if $\delta$ lies in $\mathfrak{g}$, then
$\delta'(0)$, which equals $\gamma'(0)$, also lies in $\mathfrak{g}$, as in
the proof of Proposition 16.20. Thus, the tangent space at $I$ is contained
in $\mathfrak{g}$. ■

Corollary 16.28 If a matrix Lie group $G$ is connected, then for all
$A \in G$ there exists a finite sequence $X_1, X_2, \ldots, X_N$ of elements
of $\mathfrak{g}$ such that

$$A = e^{X_1} e^{X_2} \cdots e^{X_N}.$$

Proof. If $G$ is connected in the sense of Definition 16.6 (which really
means that $G$ is path connected), then $G$ is certainly connected in the
usual topological sense of having no nontrivial sets that are both open and
closed. Let $U$ denote the set of points in $G$ that can be expressed as a
product of exponentials of elements of $\mathfrak{g}$. This set is open in
$G$ because if $A \in U$ and $B \in G$ is close to $A$, then $A^{-1}B$ is
close to $I$ in $G$, and therefore $A^{-1}B = e^X$ for some
$X \in \mathfrak{g}$. Thus, $B = Ae^X$, which means that $B$ is also a
product of exponentials. In the other direction, if $B \in G$ is in the
closure of $U$, then there is some element $A$ of $U$ that is close to $B$.
We then have, again, that $B = Ae^X$ for some $X \in \mathfrak{g}$, which,
again, means that $B \in U$. Now, $G$ is connected and $U$ is both open and
closed. Since $U$ is nonempty ($I \in U$), we have $U = G$. ■

Corollary 16.29 Suppose that $G_1$ and $G_2$ are matrix Lie groups with
Lie algebras $\mathfrak{g}_1$ and $\mathfrak{g}_2$, respectively. Suppose
that $\Phi_1 : G_1 \to G_2$ and $\Phi_2 : G_1 \to G_2$ are Lie group
homomorphisms, with associated Lie algebra homomorphisms $\phi_1$ and
$\phi_2$, respectively. If $G_1$ is connected and $\phi_1 = \phi_2$, then
$\Phi_1 = \Phi_2$.

Proof. The result follows from Corollary 16.28 and the condition
$\Phi_j(e^X) = e^{\phi_j(X)}$, $j = 1, 2$. ■

We have seen that a homomorphism of matrix Lie groups gives rise to a
homomorphism of the associated Lie algebras, and (Corollary 16.29) that if
the domain group is connected, the Lie algebra homomorphism determines
the Lie group homomorphism. A more difficult question is whether we can
go in the opposite direction, from a Lie algebra homomorphism to a Lie
group homomorphism. That is to say, given a Lie algebra homomorphism
between the Lie algebras of two matrix Lie groups, does there exist a Lie
group homomorphism related in the usual way to the Lie algebra
homomorphism? The answer turns out to be yes, provided that the domain group
$G_1$ is connected and simply connected (i.e., that every continuous loop in
$G_1$ can be shrunk continuously in $G_1$ to a point).

Theorem 16.30 Suppose that $G_1$ and $G_2$ are matrix Lie groups with Lie
algebras $\mathfrak{g}_1$ and $\mathfrak{g}_2$, respectively, and suppose
that $\phi : \mathfrak{g}_1 \to \mathfrak{g}_2$ is a Lie algebra
homomorphism. If $G_1$ is connected and simply connected, then there exists
a unique Lie group homomorphism $\Phi : G_1 \to G_2$ such that $\Phi$ and
$\phi$ are related as in Theorem 16.23.

One way to prove this deep result is to make use of the
Baker-Campbell-Hausdorff formula. (See, e.g., Chap. 3 of [21].) This formula
states that for all sufficiently small $X$ and $Y$ in $M_n(\mathbb{C})$ we
have

$$e^X e^Y = e^{X + Y + \frac{1}{2}[X,Y] + \frac{1}{12}[X,[X,Y]] - \frac{1}{12}[Y,[X,Y]] + \cdots}.$$

Here $\cdots$ denotes terms that are expressible in terms of repeated
commutators involving $X$ and $Y$, with coefficients that are “universal,”
that is, independent of $n$ (the size of the matrices) and of the choice of
$X$ and $Y$ in $M_n(\mathbb{C})$. Given a Lie algebra homomorphism
$\phi : \mathfrak{g}_1 \to \mathfrak{g}_2$, one can use the
Baker-Campbell-Hausdorff formula to construct a “local homomorphism,”
mapping a neighborhood of the identity in $G_1$ into $G_2$. If $G_1$ is
connected and simply connected, it is possible to extend this local
homomorphism to a global homomorphism. See Sect. 3.6 of [21] for the details
of this construction.
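
The role of the displayed commutator terms can be seen in a small numerical
experiment. The sketch below compares $e^X e^Y$ with the exponential of the
series truncated after the terms shown above, for two scaled sample
matrices; the matrices X0 and Y0, the scale factors, and the power-series
helper `expm` are all illustrative assumptions rather than anything taken
from [21].

```python
import numpy as np

def expm(X, terms=60):
    """Truncated power series sum_{m < terms} X^m / m! (fine for small matrices)."""
    X = np.asarray(X, dtype=complex)
    result = np.eye(X.shape[0], dtype=complex)
    term = np.eye(X.shape[0], dtype=complex)
    for m in range(1, terms):
        term = term @ X / m
        result = result + term
    return result

def bracket(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    return A @ B - B @ A

# Two noncommuting sample directions (illustrative choice).
X0 = np.array([[0.0, 1.0], [0.0, 0.0]])
Y0 = np.array([[0.0, 0.0], [1.0, 0.0]])

def bch_errors(eps):
    """Errors of e^{X+Y} and of the truncated series, for X = eps*X0, Y = eps*Y0."""
    X, Y = eps * X0, eps * Y0
    product = expm(X) @ expm(Y)
    XY = bracket(X, Y)
    Z = X + Y + XY / 2 + bracket(X, XY) / 12 - bracket(Y, XY) / 12
    naive = np.linalg.norm(product - expm(X + Y))   # all commutators dropped
    truncated = np.linalg.norm(product - expm(Z))   # displayed terms kept
    return naive, truncated

naive_large, bch_large = bch_errors(0.1)
naive_small, bch_small = bch_errors(0.05)
```

Halving the scale should cut the truncated-series error by a factor of
roughly 16, consistent with the omitted terms being of fourth order, whereas
the error of the naive guess $e^{X+Y}$ only drops by about a factor of 4.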

Corollary 16.31 Suppose that $G_1$ and $G_2$ are matrix Lie groups with Lie
algebras $\mathfrak{g}_1$ and $\mathfrak{g}_2$, respectively. If $G_1$ and
$G_2$ are connected and simply connected and $\mathfrak{g}_1$ is isomorphic
to $\mathfrak{g}_2$, then $G_1$ is isomorphic to $G_2$.

Proof. Suppose $\phi : \mathfrak{g}_1 \to \mathfrak{g}_2$ is a Lie algebra
isomorphism. Since $G_1$ is connected and simply connected, there exists a
Lie group homomorphism $\Phi : G_1 \to G_2$ related in the usual way to
$\phi$. Since $G_2$ is connected and simply connected, there exists a Lie
group homomorphism $\Psi : G_2 \to G_1$ related in the usual way to
$\phi^{-1}$. Consider now the homomorphism
$\Psi \circ \Phi : G_1 \to G_1$. By the composition property of Lie algebra
homomorphisms (Exercise 10), the Lie algebra homomorphism associated with
$\Psi \circ \Phi$ is $\phi^{-1} \circ \phi = I$. It then follows from
Corollary 16.29 that $\Psi \circ \Phi = I$. A similar argument shows that
$\Phi \circ \Psi = I$, which means that $\Phi$ is a Lie group
isomorphism. ■

Corollary 16.31 does not hold without the assumption that both groups
are simply connected, as the following important example shows.

Example 16.32 The Lie algebras $\mathfrak{su}(2)$ and $\mathfrak{so}(3)$ are
isomorphic, but the groups SU(2) and SO(3) are not isomorphic.

Since SU(2) is simply connected (Example 16.9), SO(3) must fail to be
simply connected. Indeed, $\pi_1(\mathrm{SO}(3)) = \mathbb{Z}/2$, as can be
seen from Example 16.34.

Proof. The Lie algebra $\mathfrak{su}(2)$ of SU(2) is the space of
$2 \times 2$ skew-self-adjoint matrices with trace zero. Explicitly,

$$\mathfrak{su}(2) = \left\{ \begin{pmatrix} ia & b + ic \\ -b + ic & -ia \end{pmatrix} \,\middle|\, a, b, c \in \mathbb{R} \right\}.$$

We may consider the following basis for $\mathfrak{su}(2)$:

$$E_1 := \frac{1}{2} \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}; \quad
E_2 := \frac{1}{2} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}; \quad
E_3 := \frac{1}{2} \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}.$$

Direct calculation shows that $[E_1, E_2] = E_3$, together with the
relations obtained from this by cyclic permutation of the indices. These are
the same relations as those satisfied by the basis elements $F_j$,
$j = 1, 2, 3$, for $\mathfrak{so}(3)$ in (16.2) and (16.3). Thus, there is a
Lie algebra isomorphism $\phi : \mathfrak{su}(2) \to \mathfrak{so}(3)$ such
that $\phi(E_j) = F_j$, $j = 1, 2, 3$.

On the other hand, there can be no isomorphism between SU(2) and
SO(3), since SU(2) has a nontrivial center (containing at least $I$ and
$-I$), whereas the center of SO(3) is trivial (Exercise 14). ■
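
The commutation relations for $\mathfrak{su}(2)$ can be checked the same way
as those for $\mathfrak{so}(3)$. The sketch below assumes a specific basis:
$E_1$, $E_2$, $E_3$ are taken to be one half of the matrices obtained by
setting $a$, $b$, $c$ in turn equal to 1 in the description of
$\mathfrak{su}(2)$. That choice is an assumption made here for illustration;
it is consistent with $E_1$ being diagonal and with
$[E_1, E_2] = E_3$, but other normalizations are possible.

```python
import numpy as np

# Assumed basis of su(2): half of the a-, b-, and c-directions.
E1 = 0.5 * np.array([[1j, 0], [0, -1j]])
E2 = 0.5 * np.array([[0, 1], [-1, 0]], dtype=complex)
E3 = 0.5 * np.array([[0, 1j], [1j, 0]])

def bracket(X, Y):
    """Matrix commutator [X, Y] = XY - YX."""
    return X @ Y - Y @ X

# Membership in su(2): skew-self-adjoint with trace zero.
in_su2 = all(
    np.allclose(E.conj().T, -E) and abs(np.trace(E)) < 1e-15 for E in (E1, E2, E3)
)

# The same cyclic relations as (16.3), with E_j in place of F_j.
relations_hold = (np.allclose(bracket(E1, E2), E3)
                  and np.allclose(bracket(E2, E3), E1)
                  and np.allclose(bracket(E3, E1), E2))
```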

Definition 16.33 Suppose G is a connected matrix Lie group with Lie 
algebra g. A universal cover of G is an ordered pair (G, <f>) consisting 
of a simply connected matrix Lie group G and a Lie group homomorphism 
$ : G —> G such that the associated Lie algebra homomorphism 0 : g —x 0 
is an isomorphism of the Lie algebra q of G with g. The map $ is called 
the covering map for G. 

Although each Lie group has a universal cover that is again a Lie group, 
the universal cover of a matrix Lie group may not be isomorphic to any 
matrix Lie group. [The universal cover of SL(2;K), e.g., is not a matrix Lie 
group.] It can be shown, however, that if a matrix Lie group G is compact, 
then the universal cover of G is again a matrix Lie group (not necessarily 
compact). 

Suppose $\tilde{G}$ is any simply connected Lie group with a Lie algebra $\tilde{\mathfrak{g}}$ that
is isomorphic to $\mathfrak{g}$. The choice of a particular isomorphism $\phi : \tilde{\mathfrak{g}} \to \mathfrak{g}$ gives
rise, by Theorem 16.30, to a Lie group homomorphism $\Phi : \tilde{G} \to G$, so that
$(\tilde{G}, \Phi)$ is a universal cover of G.

If $(\tilde{G}, \Phi)$ is a universal cover of G, it is often convenient to use the
isomorphism $\phi$ to identify $\tilde{\mathfrak{g}}$ with $\mathfrak{g}$. If we follow this convention, we may
say that a universal cover of G is a simply connected group $\tilde{G}$ having “the
same” Lie algebra as G.

If $(G_1, \Phi_1)$ and $(G_2, \Phi_2)$ are two universal covers of a given matrix Lie
group G, then there is a unique Lie group isomorphism $\Psi : G_1 \to G_2$ such
that $\Phi_2(\Psi(A)) = \Phi_1(A)$ for all $A \in G_1$. (This result follows easily from
Corollary 16.31.) In light of this uniqueness result, we will often speak of
“the” universal cover of G.


Example 16.34 Let $\Phi : \mathrm{SU}(2) \to \mathrm{SO}(3)$ be the unique Lie group
homomorphism for which the associated Lie algebra homomorphism $\phi$ satisfies
$\phi(E_j) = F_j$, j = 1, 2, 3. Then $\ker \Phi = \{I, -I\}$ and $(\mathrm{SU}(2), \Phi)$ is a universal
cover of SO(3).

Proof. Since $E_1$ is diagonal, it is easy to see that $e^{2\pi E_1} = -I$ in SU(2).
On the other hand, by a trivial extension of Example 16.16, we have

\[ e^{aF_1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos a & -\sin a \\ 0 & \sin a & \cos a \end{pmatrix} \]

for all $a \in \mathbb{R}$. In particular, $e^{2\pi F_1} = I$. Thus,

\[ \Phi(-I) = \Phi(e^{2\pi E_1}) = e^{2\pi F_1} = I. \]

This shows that $-I$ belongs to the kernel of $\Phi$.

Now, since $\phi$ is injective, $\Phi$ is injective in a neighborhood of I. After all,
given distinct elements A and B of SU(2) near I, Theorem 16.25 tells us
that we can express A as $e^X$ and B as $e^Y$, with X and Y being distinct
small elements of su(2). Then $\phi(X)$ and $\phi(Y)$ are distinct small elements
of so(3). Applying Theorem 16.25 again tells us that $\Phi(A) = e^{\phi(X)}$ and
$\Phi(B) = e^{\phi(Y)}$ are distinct.

We see, then, that $\ker \Phi$ is a discrete normal subgroup of SU(2). But a
standard exercise (Exercise 1) shows that a discrete normal subgroup of a
connected group is automatically central. On the other hand, it is easily
verified (Exercise 2) that the center of SU(2) is $\{I, -I\}$, so $\ker \Phi$ cannot be
larger than $\{I, -I\}$.

To show that $\Phi$ maps onto SO(3), we first verify (Exercise 13) that each
element R of SO(3) can be expressed as $R = e^X$, with $X \in \mathfrak{so}(3)$. Since $\phi$
is surjective and $\Phi(e^X) = e^{\phi(X)}$, $\Phi$ maps onto SO(3). ■
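
The two exponentials at the heart of this proof can be checked numerically. The sketch below is plain Python with a truncated Taylor series for the matrix exponential. The matrices `E1` and `F1` are assumptions (the diagonal basis element $\tfrac12\,\mathrm{diag}(i,-i)$ of su(2) and the generator of rotations about the first axis), and `expm` here is a hypothetical helper, not a library routine.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def expm(A, terms=60):
    """Matrix exponential via its Taylor series (adequate for small matrices)."""
    n = len(A)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for m in range(1, terms):
        # After this step, term equals A^m / m!.
        term = [[x / m for x in row] for row in matmul(term, A)]
        total = [[s + t for s, t in zip(r1, r2)] for r1, r2 in zip(total, term)]
    return total

def close(A, B, tol=1e-9):
    return all(abs(x - y) <= tol for r, s in zip(A, B) for x, y in zip(r, s))

def rotation_about_first_axis(a):
    """The closed form for exp(a F1) quoted in the proof."""
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

E1 = [[0.5j, 0], [0, -0.5j]]                 # assumed basis element of su(2)
F1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]      # assumed basis element of so(3)

exp_2pi_E1 = expm(scale(2 * math.pi, E1))    # should be -I in SU(2)
exp_2pi_F1 = expm(scale(2 * math.pi, F1))    # should be  I in SO(3)

print(close(exp_2pi_E1, [[-1, 0], [0, -1]]))
print(close(exp_2pi_F1, rotation_about_first_axis(0.0)))
```

The same one-parameter subgroup that returns to the identity in SO(3) after time $2\pi$ only reaches $-I$ in SU(2), which is how $-I$ ends up in the kernel.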

16.7 Finite-Dimensional Representations of Lie
Groups and Lie Algebras

A representation of a group G is a homomorphism $\Pi$ of G into GL(V),
the group of invertible linear transformations on some vector space V. If $\Pi$
is injective then G is isomorphic to its image under $\Pi$; thus, $\Pi$ serves to
“represent” G concretely as a group of invertible linear transformations.
(We continue to use the term “representation” even if $\Pi$ is not injective.)
Similarly, a representation of a Lie algebra $\mathfrak{g}$ is a Lie algebra homomorphism
of $\mathfrak{g}$ into gl(V), the space of all linear transformations of V, where we equip
gl(V) with the bracket $[X, Y] := XY - YX$.

Recall that an action of a group G on a set X is a map from $G \times X$ to X,
denoted $(g, x) \mapsto g \cdot x$, satisfying $e \cdot x = x$ for all $x \in X$ and $g \cdot (h \cdot x) = (gh) \cdot x$
for all $g, h \in G$ and $x \in X$. A representation $\Pi$ of G on some vector space
V gives rise to a linear action of G on V, given by $g \cdot v = \Pi(g)v$. (A linear
action is an action for which the map $v \mapsto g \cdot v$ is linear for each g.) Thus,
we may use $g \cdot v$ as an alternative notation to $\Pi(g)v$, when convenient.
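
As a concrete illustration of these axioms, here is a minimal Python sketch. It assumes, purely for illustration, that G is the additive group of real numbers and that $\Pi(\theta)$ is rotation of the plane by the angle $\theta$; the names `Pi` and `act` are hypothetical.

```python
import math

def Pi(theta):
    """Rotation of the plane by angle theta, as a 2 x 2 matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def act(theta, v):
    """The linear action g . v = Pi(g) v."""
    M = Pi(theta)
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

def close(u, v, tol=1e-12):
    return all(abs(a - b) <= tol for a, b in zip(u, v))

g, h = 0.4, 1.3            # two group elements; the group law is addition
v, w = [2.0, -1.0], [0.5, 3.0]

identity_ok = close(act(0.0, v), v)                    # e . v = v
compatible = close(act(g, act(h, v)), act(g + h, v))   # g . (h . v) = (gh) . v
vw = [a + b for a, b in zip(v, w)]
additive = close(act(g, vw),
                 [a + b for a, b in zip(act(g, v), act(g, w))])

print(identity_ok, compatible, additive)
```

The first two checks are the action axioms; the third is the additivity half of the linearity that distinguishes a representation from an arbitrary action.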

16.7.1 Finite-Dimensional Representations 

If G is a matrix Lie group, then G is already represented as a group of 
matrices. Nevertheless, it is of interest [as we will see in Chap. 17 in the 
case G = SO(3)] to explore other representations of G. Since a matrix Lie 
group has a topological structure (inherited from $M_n(\mathbb{C})$), it is natural to
require representations to be continuous. It is also simpler to deal at first 
with finite-dimensional representations, that is, those where the vector 
space in question is finite dimensional, although eventually we will need to 
consider infinite-dimensional representations as well. This discussion leads 
to the following definition. 

Definition 16.35 Let $G \subset \mathrm{GL}(n;\mathbb{C})$ be a matrix Lie group. A
finite-dimensional representation of G is a continuous homomorphism of G
into GL(V), the group of invertible linear transformations of a
finite-dimensional vector space V.

We will assume that all of our vector spaces are over the field $\mathbb{C}$, even
though it is occasionally of interest to consider also representations over $\mathbb{R}$.
The topology on GL(V) is defined by picking a basis, and thereby identifying
the space of linear maps of V to V with $M_n(\mathbb{C})$. We then use the subset
topology on $\mathrm{GL}(V) \cong \mathrm{GL}(n;\mathbb{C}) \subset M_n(\mathbb{C})$. This topology is easily seen to
be independent of the choice of basis.

An important example of representations in quantum theory arises from
the time-independent Schrödinger equation in $\mathbb{R}^n$, namely the equation
$H\psi = E\psi$, for a fixed constant $E \in \mathbb{R}$. If H is invariant under rotations,
then the space of solutions to this equation is invariant under rotations.
Note that an individual solution $\psi$ to this equation may or may not be a
rotationally invariant (i.e., radial) function. But if H is rotationally
invariant, then rotating a solution to $H\psi = E\psi$ will give another solution of this
equation. Even if the quantum Hilbert space is infinite dimensional, the
solution spaces to $H\psi = E\psi$ are typically finite dimensional and
constitute finite-dimensional representations of the group SO(n) of rotations. If
we can understand what all possible finite-dimensional representations of
SO(n) look like, we will have made a lot of progress in understanding
solutions to $H\psi = E\psi$ in the rotationally invariant case. This line of reasoning
will be explored in detail in Chap. 18.

We may consider as well finite-dimensional representations of Lie
algebras. Assuming our Lie algebra $\mathfrak{g}$ is finite dimensional (which is the only
case we will consider in this chapter), there is no need to impose a
requirement of continuity, since a linear map of one finite-dimensional real
or complex vector space to another is automatically continuous.

Definition 16.36 A finite-dimensional representation of a Lie algebra
$\mathfrak{g}$ is a Lie algebra homomorphism of $\mathfrak{g}$ into gl(V), the space of all linear
transformations of V. Here gl(V) is considered as a Lie algebra with bracket
given by $[X, Y] = XY - YX$.

We typically consider Lie algebras defined over the field $\mathbb{R}$, since the Lie
algebra of a matrix Lie group is in general only a real subspace of $M_n(\mathbb{C})$.
Nevertheless, it is convenient to consider vector spaces over $\mathbb{C}$. If $\mathfrak{g}$ is a
real Lie algebra and V, and therefore also gl(V), is a complex vector space,
then we require only that $\pi : \mathfrak{g} \to \mathfrak{gl}(V)$ be real linear, which is the only
requirement that makes sense.

In the interest of simplifying the terminology, we will sometimes speak
of “a representation V,” without making explicit mention of the
homomorphism $\Pi$ or $\pi$.

Definition 16.37 If $\Pi : G \to \mathrm{GL}(V)$ is a representation of a matrix Lie
group G, then a subspace W of V is called an invariant subspace if
$\Pi(g)w \in W$ for all $g \in G$ and $w \in W$. Similarly, if $\pi : \mathfrak{g} \to \mathfrak{gl}(V)$ is
a representation of a Lie algebra $\mathfrak{g}$, then a subspace W of V is called an
invariant subspace if $\pi(X)w \in W$ for all $X \in \mathfrak{g}$ and $w \in W$. A
representation of a group or Lie algebra is called irreducible if the only invariant
subspaces are W = V and W = {0}.

Definition 16.38 If $(\Pi, V_1)$ and $(\Sigma, V_2)$ are representations of a matrix
Lie group G, a linear map $\Phi : V_1 \to V_2$ is called an intertwining map (or
morphism) if $\Phi(\Pi(g)v) = \Sigma(g)\Phi(v)$ for all $g \in G$ and $v \in V_1$, with an
analogous definition for intertwining maps of Lie algebra representations. If an
intertwining map is an invertible linear map, it is called an isomorphism.
Two representations are said to be isomorphic (or equivalent) if there
exists an isomorphism between them.

In the “action” notation, the requirement on an intertwining map $\Phi$ is
that $\Phi(g \cdot v) = g \cdot \Phi(v)$, meaning that $\Phi$ commutes with the action of G.
A typical goal of representation theory is to classify all finite-dimensional
irreducible representations of G up to isomorphism.

Given a representation $\Pi : G \to \mathrm{GL}(V)$ of a matrix Lie group G, we
can identify GL(V) with $\mathrm{GL}(N;\mathbb{C})$ and gl(V) with $\mathfrak{gl}(N;\mathbb{C})$ by picking a
basis for V. We may then apply Theorem 16.23 to obtain a representation
$\pi : \mathfrak{g} \to \mathfrak{gl}(V)$ such that

\[ \Pi(e^X) = e^{\pi(X)} \]

for all $X \in \mathfrak{g}$.

Proposition 16.39 Suppose G is a connected matrix Lie group with Lie
algebra $\mathfrak{g}$. Suppose that $\Pi : G \to \mathrm{GL}(V)$ is a finite-dimensional
representation of G and $\pi : \mathfrak{g} \to \mathfrak{gl}(V)$ is the associated Lie algebra representation.
Then a subspace W of V is invariant under the action of G if and only if it
is invariant under the action of $\mathfrak{g}$. In particular, $\Pi$ is irreducible if and only
if $\pi$ is irreducible. Furthermore, two representations of G are isomorphic if
and only if the associated Lie algebra representations are isomorphic.

In general, given a representation $\pi$ of $\mathfrak{g}$, there may be no representation
$\Pi$ such that $\pi$ and $\Pi$ are related in the usual way. If, however, G is simply
connected, Theorem 16.30 tells us that there is, in fact, a $\Pi$ associated with
every $\pi$.

Proof. Suppose $W \subset V$ is invariant under $\pi(X)$ for all $X \in \mathfrak{g}$. Then
W is invariant under $\pi(X)^m$ for all m. Since V is finite dimensional, any
subspace of it is automatically a closed subset and thus W is invariant
under

\[ \Pi(e^X) = e^{\pi(X)} = \sum_{m=0}^{\infty} \frac{\pi(X)^m}{m!}. \]

Since G is connected, every element of G is (Corollary 16.28) a product
of exponentials of elements of $\mathfrak{g}$, and so W is invariant under $\Pi(A)$ for all
$A \in G$.


In the other direction, if W is invariant under $\Pi(A)$ for all $A \in G$, then
since W is closed, it is invariant under

\[ \pi(X) = \lim_{h \to 0} \frac{\Pi(e^{hX}) - I}{h} \]

for all $X \in \mathfrak{g}$.

Now suppose $\Pi_1$ and $\Pi_2$ are two representations of G, acting on vector
spaces $V_1$ and $V_2$, respectively. If $\Phi : V_1 \to V_2$ is an invertible linear map,
then an argument similar to the above shows that $\Phi\Pi_1(A) = \Pi_2(A)\Phi$ for all
$A \in G$ if and only if $\Phi\pi_1(X) = \pi_2(X)\Phi$ for all $X \in \mathfrak{g}$. Thus, $\Phi$ is an
isomorphism of group representations if and only if it is an isomorphism of
Lie algebra representations. ■

Theorem 16.40 (Schur’s Lemma) If $V_1$ and $V_2$ are two irreducible
representations of a group or Lie algebra, then the following hold.

1. If $\Phi : V_1 \to V_2$ is an intertwining map, then either $\Phi = 0$ or $\Phi$ is an
isomorphism.

2. If $\Phi : V_1 \to V_2$ and $\Psi : V_1 \to V_2$ are nonzero intertwining maps, then
there exists a nonzero constant $c \in \mathbb{C}$ such that $\Phi = c\Psi$. In particular,
if $\Phi$ is an intertwining map of $V_1$ to itself then $\Phi = cI$.

Although the first part of Schur’s lemma holds for representations over
an arbitrary field, the second part holds only for representations over
algebraically closed fields.

Proof. It is easy to see that $\ker \Phi$ is an invariant subspace of $V_1$. Since
$V_1$ is irreducible, this means that either $\ker \Phi = V_1$, in which case $\Phi = 0$,
or $\ker \Phi = \{0\}$, in which case $\Phi$ is injective. Similarly, the range of $\Phi$ is
invariant, and thus equal to either {0} or $V_2$. If $\Phi$ is not zero, then the
range of $\Phi$ is not zero, hence all of $V_2$. Thus, if $\Phi$ is not zero, it is both
injective and surjective, establishing Point 1.

For Point 2, since $\Phi$ and $\Psi$ are nonzero, they are isomorphisms, by
Point 1. It suffices to prove that $\Gamma := \Phi^{-1}\Psi$ is a multiple of the
identity, where $\Gamma$ is an intertwining map of $V_1$ to itself. Since we are
working over $\mathbb{C}$, $\Gamma$ must have at least one eigenvalue $\lambda$. If W denotes the
$\lambda$-eigenspace of $\Gamma$, then W is invariant under the action of the group or Lie
algebra. After all, if $\Gamma w = \lambda w$, then (in the notation of the group case)
$\Gamma(\Pi(A)w) = \Pi(A)\Gamma w = \lambda\Pi(A)w$. Since $\lambda$ is an eigenvalue of $\Gamma$, the
invariant subspace W is nonzero and thus $W = V_1$, which means precisely
that $\Gamma = \lambda I$. ■
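
As a small worked instance of Point 2, one can compute directly which 2 × 2 matrices commute with the defining representation of su(2) on $\mathbb{C}^2$. The sketch below assumes the basis elements $E_1 = \tfrac12\,\mathrm{diag}(i,-i)$ and $E_2 = \tfrac12\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$; the symbols a, b, c, d are arbitrary complex numbers introduced only for this illustration.

```latex
\[
M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \qquad
[E_1, M] = \begin{pmatrix} 0 & ib \\ -ic & 0 \end{pmatrix},
\]
so $[E_1, M] = 0$ forces $b = c = 0$. For the remaining diagonal matrices,
\[
\left[ E_2, \begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix} \right]
  = \frac{1}{2} \begin{pmatrix} 0 & d - a \\ d - a & 0 \end{pmatrix},
\]
so $[E_2, M] = 0$ forces $a = d$. Hence $M = aI$.
```

This agrees with what Schur's lemma predicts for an intertwining map of an irreducible representation to itself, since the defining representation of su(2) has no one-dimensional invariant subspace.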

16.7.2 Unitary Representations 

In quantum mechanics, we are interested not only in vector spaces, but, 
more specifically, in Hilbert spaces, since expectation values are defined in 
terms of an inner product. We wish to consider, then, actions of a group 
that preserve the inner product as well as the linear structure. Although 
the Hilbert spaces in quantum mechanics are generally infinite dimensional, 
we restrict our attention in this section to the finite-dimensional case. 

Definition 16.41 Suppose V is a finite-dimensional Hilbert space over $\mathbb{C}$.
Denote by U(V) the group of invertible linear transformations of V that
preserve the inner product. A (finite-dimensional) unitary representation
of a matrix Lie group G is a continuous homomorphism $\Pi : G \to \mathrm{U}(V)$,
for some finite-dimensional Hilbert space V.

Proposition 16.42 Let $\Pi : G \to \mathrm{GL}(V)$ be a finite-dimensional
representation of a connected matrix Lie group G, and let $\pi$ be the associated
representation of the Lie algebra $\mathfrak{g}$ of G. Let $\langle \cdot, \cdot \rangle$ be an inner product on V.
Then $\Pi$ is unitary with respect to $\langle \cdot, \cdot \rangle$ if and only if $\pi(X)$ is
skew-self-adjoint with respect to $\langle \cdot, \cdot \rangle$ for all $X \in \mathfrak{g}$, that is, if and only if

\[ \pi(X)^* = -\pi(X) \]

for all $X \in \mathfrak{g}$.

In a slight abuse of notation, we will refer to a representation $\pi$ of a
Lie algebra $\mathfrak{g}$ on a finite-dimensional inner product space as unitary if
$\pi(X)^* = -\pi(X)$ for all $X \in \mathfrak{g}$.

Proof. Suppose first that $\Pi(A)$ is unitary for all $A \in G$. Then for all $X \in \mathfrak{g}$
and $t \in \mathbb{R}$ we have

\[ \Pi(e^{tX})^* = \Pi(e^{tX})^{-1} = \Pi(e^{-tX}) = e^{-t\pi(X)}. \]

On the other hand,

\[ \Pi(e^{tX})^* = (e^{t\pi(X)})^* = e^{t\pi(X)^*}. \]

Thus,

\[ e^{t\pi(X)^*} = e^{-t\pi(X)} \]

for all t. Differentiating at t = 0 yields $\pi(X)^* = -\pi(X)$.

In the other direction, if $\pi(X)^* = -\pi(X)$ for all $X \in \mathfrak{g}$, then

\[ \Pi(e^X)^* = e^{\pi(X)^*} = e^{-\pi(X)} = \Pi(e^{-X}) = \Pi(e^X)^{-1}, \]

meaning that $\Pi(e^X)$ is unitary. Since G is connected, Corollary 16.28 tells
us that each element A of G is expressible as a product of exponentials,
from which it follows that $\Pi(A)$ is unitary. ■
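
The correspondence between skew-self-adjoint generators and unitary exponentials can be observed numerically. The following Python sketch uses the defining representation of su(2) as an assumed example; the basis matrices, the coefficients 0.3, 1.1, -0.7, and the helper `expm` (a truncated Taylor series) are all illustrative choices rather than anything fixed by the text.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def expm(A, terms=40):
    """Matrix exponential via its Taylor series (adequate for small matrices)."""
    n = len(A)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for m in range(1, terms):
        term = [[x / m for x in row] for row in matmul(term, A)]
        total = [[s + t for s, t in zip(r1, r2)] for r1, r2 in zip(total, term)]
    return total

def dagger(A):
    """Adjoint (conjugate transpose) of a square matrix."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

def close(A, B, tol=1e-10):
    return all(abs(x - y) <= tol for r, s in zip(A, B) for x, y in zip(r, s))

def is_skew_adjoint(X):
    return close(dagger(X), [[-x for x in row] for row in X])

def is_unitary(U):
    n = len(U)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    return close(matmul(U, dagger(U)), identity)

# X = 0.3 E1 + 1.1 E2 - 0.7 E3 in an assumed basis of su(2).
X = [[0.15j, 0.55 - 0.35j], [-0.55 - 0.35j, -0.15j]]
# A self-adjoint matrix, which is not skew-self-adjoint.
H = [[1.0, 0.0], [0.0, -1.0]]

print(is_skew_adjoint(X), is_unitary(expm(X)))
print(is_skew_adjoint(H), is_unitary(expm(H)))
```

The second pair shows the contrast: exponentiating a generator that is not skew-self-adjoint produces an invertible matrix that fails to preserve the inner product.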


16.7.3 Projective Unitary Representations 

In quantum mechanics, two unit vectors in the quantum Hilbert space that
differ by multiplication by a constant are considered to represent the same
physical state. Thus, an operator of the form $e^{i\theta}I$, with $\theta \in \mathbb{R}$, will act as the
identity at the level of the physical states. Suppose that V is a Hilbert space
over $\mathbb{C}$, assumed for the moment to be finite dimensional. Then it is natural
to consider homomorphisms not into U(V) but rather into the quotient
group $\mathrm{U}(V)/\{e^{i\theta}I\}$. Of course, given a homomorphism $\Pi$ of G into U(V),
we can always turn $\Pi$ into a homomorphism of G into the quotient group,
just by composing $\Pi$ with the quotient map. Not every homomorphism into
the quotient group, however, arises from a homomorphism into U(V).

Definition 16.43 Suppose V is a finite-dimensional Hilbert space over $\mathbb{C}$.
Then the projective unitary group over V, denoted PU(V), is the
quotient group

\[ \mathrm{PU}(V) = \mathrm{U}(V)/\{e^{i\theta}I\}, \]

where $\{e^{i\theta}I\}$ denotes the group of matrices of the form $e^{i\theta}I$, $\theta \in \mathbb{R}$.

Note that $\{e^{i\theta}I\}$ is a closed normal subgroup of U(V). Now, U(V) is
(isomorphic to) a matrix Lie group, since we can identify it with U(n) by
picking an orthonormal basis for V. In general, the quotient of a matrix
Lie group by a closed normal subgroup may not be a matrix Lie group. In
this case, however, it is not hard to realize the quotient $\mathrm{U}(n)/\{e^{i\theta}I\}$ as a
matrix Lie group.


Proposition 16.44 If V is a finite-dimensional Hilbert space over $\mathbb{C}$, then
PU(V) is isomorphic to a matrix Lie group.

Let $Q : \mathrm{U}(V) \to \mathrm{PU}(V)$ be the quotient homomorphism and let
$q : \mathfrak{u}(V) \to \mathfrak{pu}(V)$ be the associated Lie algebra homomorphism. Then q maps
u(V) onto pu(V) and the kernel of q is the space of matrices of the form
$iaI$ with $a \in \mathbb{R}$. Thus, pu(V) is isomorphic to $\mathfrak{u}(V)/\{iaI\}$.


The Lie algebra u(V) of U(V) is the space of skew-self-adjoint operators
on V. In Proposition 16.44, the space $\{iaI\}$ is an ideal in u(V) and the
quotient is in the sense of Lie algebras over $\mathbb{R}$; see Exercise 9. If dim V = N,
then it is not hard to see that the Lie algebra $\mathfrak{pu}(V) = \mathfrak{u}(V)/\{iaI\}$ is
isomorphic to the Lie algebra su(N). The group PU(V) is not, however,
isomorphic to the group SU(N). See Exercise 16.

Proof. If dim V = N, then gl(V), the space of all linear maps of V to V,
has dimension $N^2$. Given $U \in \mathrm{U}(V)$, we can define

\[ C_U : \mathfrak{gl}(V) \to \mathfrak{gl}(V) \]

by

\[ C_U(X) = UXU^{-1}. \]

(That is to say, $C_U$ is conjugation by U.) Note that $(C_U)^{-1} = C_{U^{-1}}$ and
$C_{UV} = C_U C_V$. Thus, C (i.e., the map $U \mapsto C_U$) is a homomorphism of
U(V) into GL(gl(V)), and this homomorphism is clearly continuous. If U
is a multiple of the identity, then $C_U$ is the identity operator on gl(V).
Conversely, if $C_U$ is the identity, then UX = XU for all $X \in \mathfrak{gl}(V)$, which
implies (Exercise 18) that U is a multiple of the identity. Thus, the kernel
of C consists precisely of those scalar multiples of the identity that are in
U(V); that is, $\ker C = \{e^{i\theta}I\}$.

We have constructed, then, a homomorphism of U(V) into
$\mathrm{GL}(\mathfrak{gl}(V)) \cong \mathrm{GL}(N^2;\mathbb{C})$ with a kernel that is precisely $\{e^{i\theta}I\}$. The image of U(V)
under this homomorphism is, therefore, isomorphic to the quotient group
$\mathrm{U}(V)/\{e^{i\theta}I\}$. Furthermore, since U(V) is compact, the image of U(V)
under C is compact and thus closed. This image is, then, a matrix Lie group
isomorphic to PU(V).

Let c be the Lie algebra homomorphism associated with the
homomorphism C. Using Point 3 of Theorem 16.23, we may calculate that

\[ c_X(Y) = \frac{d}{dt}\, e^{tX} Y e^{-tX} \Big|_{t=0} = XY - YX = [X, Y]. \]

Using Exercise 18 again, we see that $c_X = 0$ if and only if X is a multiple
of the identity. Thus, the kernel of c consists of all the scalar multiples of
I in u(V), namely $\{iaI\}$.

Now, the image of U(V) under C is (isomorphic to) PU(V); in particular,
C maps U(V) onto PU(V). It follows that c must map u(V) onto pu(V).
(This claim follows from Theorem 3.15 in [21].) Thus, $\mathfrak{pu}(V) \cong \mathfrak{u}(V)/\{iaI\}$. ■
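
Both halves of this argument can be illustrated numerically: conjugation by U does not see the phase of U, and the derivative of conjugation along $e^{tX}$ is the commutator. The Python sketch below is a minimal check under stated assumptions: `X` is taken to be one assumed basis element of su(2), `Y` is an arbitrary 2 × 2 complex matrix, and the derivative is approximated by a central difference with a hypothetical step `h`.

```python
import cmath

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def expm(A, terms=40):
    """Matrix exponential via its Taylor series (adequate for small matrices)."""
    n = len(A)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for m in range(1, terms):
        term = [[x / m for x in row] for row in matmul(term, A)]
        total = [[s + t for s, t in zip(r1, r2)] for r1, r2 in zip(total, term)]
    return total

def dagger(A):
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

def conjugate_by(U, Y):
    """C_U(Y) = U Y U^{-1}, using U^{-1} = U* because U is unitary."""
    return matmul(matmul(U, Y), dagger(U))

def bracket(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[x - y for x, y in zip(r, s)] for r, s in zip(AB, BA)]

def close(A, B, tol=1e-10):
    return all(abs(x - y) <= tol for r, s in zip(A, B) for x, y in zip(r, s))

X = [[0, 0.5], [-0.5, 0]]          # skew-self-adjoint, so exp(tX) is unitary
Y = [[1, 2j], [3, -1]]             # an arbitrary element of gl(2; C)
U = expm(scale(0.8, X))
phase = cmath.exp(0.37j)

# Conjugation is unchanged when U is multiplied by a phase.
phase_invisible = close(conjugate_by(scale(phase, U), Y), conjugate_by(U, Y))

# Central-difference approximation to d/dt C_{exp(tX)}(Y) at t = 0.
h = 1e-5
plus = conjugate_by(expm(scale(h, X)), Y)
minus = conjugate_by(expm(scale(-h, X)), Y)
derivative = [[(p - m) / (2 * h) for p, m in zip(r, s)] for r, s in zip(plus, minus)]
derivative_is_bracket = close(derivative, bracket(X, Y), tol=1e-6)

print(phase_invisible, derivative_is_bracket)
```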


Definition 16.45 A finite-dimensional projective unitary representation
of a matrix Lie group G is a continuous homomorphism $\Pi$ of G into
PU(V), where V is a finite-dimensional Hilbert space over $\mathbb{C}$. A subspace
W of V is said to be invariant under $\Pi$ if for each $A \in G$, W is invariant
under U for every $U \in \mathrm{U}(V)$ such that $[U] = \Pi(A)$. A projective unitary
representation $(\Pi, V)$ is irreducible if the only invariant subspaces are {0}
and V.

Given an ordinary unitary representation $\Sigma : G \to \mathrm{U}(V)$, we can always
form a projective representation $\Pi : G \to \mathrm{PU}(V)$, simply by setting
$\Pi = Q \circ \Sigma$. Not every projective representation, however, arises in this fashion.
Thus, considering projective representations gives us more flexibility than
considering ordinary unitary representations.

Proposition 16.46 Let $\Pi : G \to \mathrm{PU}(V)$ be a finite-dimensional projective
unitary representation of a matrix Lie group G, and let $\pi : \mathfrak{g} \to \mathfrak{pu}(V)$ be
the associated Lie algebra homomorphism. Then there exists a Lie algebra
homomorphism $\sigma : \mathfrak{g} \to \mathfrak{u}(V)$ such that $\pi(X) = q(\sigma(X))$ for all $X \in \mathfrak{g}$.
It is possible to choose $\sigma$ so that $\mathrm{trace}(\sigma(X)) = 0$ for all $X \in \mathfrak{g}$, and $\sigma$ is
unique if we require this condition.

That is to say, every finite-dimensional projective representation can be
“de-projectivized” at the Lie algebra level. In general, $\sigma$ is not unique,
because there may be $\sigma$’s for which $\mathrm{trace}(\sigma(X))$ is nonzero for some X.
On the other hand, if $\mathfrak{g}$ has the property that every $X \in \mathfrak{g}$ is a linear
combination of commutators (which is true if $\mathfrak{g} = \mathfrak{so}(3)$), then $\sigma$ is unique.
See Exercise 15.

Proof. Recall that $\mathfrak{pu}(V) = \mathfrak{u}(V)/\{iaI\}$. That is, for each $X \in \mathfrak{g}$, $\pi(X)$
denotes a whole family of operators that differ by adding $iaI$. If $Y \in \mathfrak{u}(V)$
is any representative of $\pi(X)$, then since $Y^* = -Y$, the trace of Y will
be pure imaginary. Thus, there is a unique pure-imaginary constant
$c = -\mathrm{trace}(Y)/\dim V$ such that the trace of $Y + cI$ is zero. Let us then set
$\sigma(X) = Y + cI$. Since $\pi$ is a Lie algebra homomorphism, $\sigma([X, Y])$ will
equal $[\sigma(X), \sigma(Y)] + iaI$, for some $a \in \mathbb{R}$. Since $\mathrm{trace}(\sigma([X, Y])) = 0$ by
construction and since the commutator of any two matrices has trace zero,
we see that actually a = 0. Thus, a $\sigma$ as in the proposition exists, and it is
unique if we require that $\sigma(X)$ have trace zero. ■
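
The choice of the trace-zero representative in this proof is a one-line computation. Here is a minimal Python sketch; the function name `traceless_representative`, the sample matrix `Y`, and the shift 2.5 are hypothetical and chosen only to illustrate that every representative of the same class leads to the same $\sigma(X)$.

```python
def trace(Y):
    return sum(Y[i][i] for i in range(len(Y)))

def add_scalar(Y, c):
    """Return Y + c I."""
    n = len(Y)
    return [[Y[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]

def traceless_representative(Y):
    """Given skew-self-adjoint Y, return (Y + c I, c) with c = -trace(Y)/dim V."""
    c = -trace(Y) / len(Y)
    return add_scalar(Y, c), c

def close(A, B, tol=1e-12):
    return all(abs(x - y) <= tol for r, s in zip(A, B) for x, y in zip(r, s))

# A skew-self-adjoint 2 x 2 matrix with nonzero (purely imaginary) trace.
Y = [[1.2j, 0.5], [-0.5, 0.2j]]
sigma, c = traceless_representative(Y)

# Another representative of the same class in u(V)/{iaI}.
Y_shifted = add_scalar(Y, 2.5j)
sigma_shifted, c_shifted = traceless_representative(Y_shifted)

print(abs(trace(sigma)) < 1e-12)        # the chosen representative is traceless
print(abs(c.real) < 1e-12)              # the shift c is purely imaginary
print(close(sigma, sigma_shifted))      # independent of the representative
```

Because the shift c is purely imaginary, $Y + cI$ remains skew-self-adjoint, so the chosen representative still lies in u(V).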

Theorem 16.47 Suppose G is a matrix Lie group and $\tilde{G}$ is a universal
cover of G, with covering map $\Phi$. Then the following hold.

1. Let $\Pi : G \to \mathrm{PU}(V)$ be a finite-dimensional projective unitary
representation of G. Then there is an ordinary unitary representation
$\Sigma : \tilde{G} \to \mathrm{U}(V)$ of $\tilde{G}$ such that $\Pi \circ \Phi = Q \circ \Sigma$. Any such $\Sigma$ is
irreducible if and only if $\Pi$ is irreducible. It is possible to choose $\Sigma$ so
that $\det(\Sigma(A)) = 1$ for all $A \in \tilde{G}$, and $\Sigma$ is unique if we require this
condition.

2. Let $\Sigma$ be a finite-dimensional irreducible unitary representation of $\tilde{G}$.
Then the kernel of the associated projective unitary representation
$Q \circ \Sigma$ contains the kernel of the covering map $\Phi$. Thus, $Q \circ \Sigma$ factors
through G and gives rise to a projective unitary representation of G.

In the finite-dimensional case, then, there is a one-to-one correspondence
between irreducible projective unitary representations of G and irreducible,
determinant-one ordinary unitary representations of $\tilde{G}$. Point 1 of the
theorem means that any finite-dimensional projective unitary representation
of the group G can be “de-projectivized” at the expense of passing to the
universal cover $\tilde{G}$ of G.

Note that Theorem 16.47 applies only to finite-dimensional projective 
unitary representations. Example 16.56 will provide an infinite-dimensional 
example in which Point 1 of the theorem fails. 

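Proof. If $\mathfrak{g}$ is the Lie algebra of G, Proposition 16.46 tells us that we can
find an ordinary representation $\sigma : \mathfrak{g} \to \mathfrak{u}(V)$ such that $q \circ \sigma = \pi$. We then
define a representation $\tilde{\sigma} : \tilde{\mathfrak{g}} \to \mathfrak{u}(V)$ of the Lie algebra $\tilde{\mathfrak{g}}$ of $\tilde{G}$ by setting
$\tilde{\sigma}(X) = \sigma(\phi(X))$, $X \in \tilde{\mathfrak{g}}$. Since $\tilde{G}$ is simply connected, we can then find
a unique representation $\Sigma : \tilde{G} \to \mathrm{U}(V)$ such that $\Sigma(e^X) = e^{\tilde{\sigma}(X)}$ for all
$X \in \tilde{\mathfrak{g}}$. Since

\[ q \circ \tilde{\sigma} = q \circ \sigma \circ \phi = \pi \circ \phi, \]

it follows that $Q \circ \Sigma = \Pi \circ \Phi$. Furthermore, if $\Sigma$ maps into SU(V), then
$\sigma = \tilde{\sigma} \circ \phi^{-1}$ maps into su(V). This condition uniquely determines $\sigma$ and thus
also $\tilde{\sigma}$ and $\Sigma$, establishing Point 1 of the theorem.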

For Point 2, observe that $\ker \Phi$ is a discrete normal subgroup of $\tilde{G}$, which
is therefore central (Exercises 1 and 12). Thus, for all $A \in \ker \Phi$, we have

\[ \Sigma(A)\Sigma(B) = \Sigma(AB) = \Sigma(BA) = \Sigma(B)\Sigma(A) \]

for all $B \in \tilde{G}$. That is to say, $\Sigma(A)$ is an intertwining map of V to itself.
Since V is irreducible as a representation of $\tilde{G}$, Schur’s lemma tells us
that $\Sigma(A) = cI$, where $|c| = 1$ because $\Sigma(A) \in \mathrm{U}(V)$. Thus, A is in the
kernel of the associated projective representation $Q \circ \Sigma$. ■


16.8 New Representations from Old 


In this section, we consider three basic mechanisms for combining
representations to produce new representations: direct sums, tensor products,
and duals. This section assumes familiarity with these notions at the level
of vector spaces; a brief review is provided in Appendix A.1.

Definition 16.48 Suppose $(\Pi_1, V_1)$ and $(\Pi_2, V_2)$ are representations of a
matrix Lie group G. The direct sum of these two representations is the
representation $\Pi_1 \oplus \Pi_2 : G \to \mathrm{GL}(V_1 \oplus V_2)$ given by

\[ (\Pi_1 \oplus \Pi_2)(A) = \Pi_1(A) \oplus \Pi_2(A). \]

The tensor product of $\Pi_1$ and $\Pi_2$ is the representation
$\Pi_1 \otimes \Pi_2 : G \to \mathrm{GL}(V_1 \otimes V_2)$ given by

\[ (\Pi_1 \otimes \Pi_2)(A) = \Pi_1(A) \otimes \Pi_2(A). \]

Finally, the dual of $\Pi_1$ is the representation $\Pi_1^* : G \to \mathrm{GL}(V_1^*)$ given by

\[ \Pi_1^*(A) = \Pi_1(A^{-1})^{tr} = (\Pi_1(A)^{tr})^{-1}. \]

Similarly, the direct sum, tensor product, and dual of Lie algebra
representations can be defined by

\[ (\pi_1 \oplus \pi_2)(X) = \pi_1(X) \oplus \pi_2(X), \]
\[ (\pi_1 \otimes \pi_2)(X) = \pi_1(X) \otimes I + I \otimes \pi_2(X), \]
\[ \pi_1^*(X) = -\pi_1(X)^{tr}. \]
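
The minus sign in the dual of a Lie algebra representation can be motivated in the same way as the tensor product formula. The following short derivation is a sketch, assuming $\Pi_1(e^{tX}) = e^{t\pi_1(X)}$ as in Theorem 16.23 and using only the fact that the transpose reverses the order of products.

```latex
\[
\frac{d}{dt}\, \Pi_1^*(e^{tX}) \Big|_{t=0}
  = \frac{d}{dt}\, \bigl( e^{-t\pi_1(X)} \bigr)^{tr} \Big|_{t=0}
  = -\pi_1(X)^{tr}.
\]
The sign is also what makes brackets come out right: since
$(AB)^{tr} = B^{tr} A^{tr}$,
\[
-\bigl( [\pi_1(X), \pi_1(Y)] \bigr)^{tr}
  = \pi_1(X)^{tr} \pi_1(Y)^{tr} - \pi_1(Y)^{tr} \pi_1(X)^{tr}
  = \bigl[ -\pi_1(X)^{tr},\, -\pi_1(Y)^{tr} \bigr].
\]
```

Without the minus sign, the map $X \mapsto \pi_1(X)^{tr}$ would reverse brackets rather than preserve them.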


It is important to note the differences in formulas between the group and
the Lie algebra in the case of tensor products and dual representations. It
is easy to motivate the definitions for the Lie algebra: If G acts on $V_1 \otimes V_2$
by $\Pi_1(A) \otimes \Pi_2(A)$, then the associated Lie algebra action will be given by

\[ \frac{d}{dt}\, \Pi_1(e^{tX}) \otimes \Pi_2(e^{tX}) \Big|_{t=0} = \pi_1(X) \otimes I + I \otimes \pi_2(X). \]

Of course, we continue to use this last formula for tensor products of Lie
algebra representations, even if there is no associated group representation.
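
The product-rule computation above can be checked with Kronecker products. The Python sketch below assumes $\pi_1 = \pi_2$ is the defining representation of su(2) in an assumed basis $E_1, E_2, E_3$ with $[E_1, E_2] = E_3$; `kron`, `tensor_rep`, and `expm` are hypothetical helpers, and the derivative is approximated by a central difference.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(A, B):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(A, B)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def bracket(A, B):
    return add(matmul(A, B), scale(-1, matmul(B, A)))

def kron(A, B):
    """Kronecker product: the matrix of A (x) B in the product basis."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def expm(A, terms=40):
    """Matrix exponential via its Taylor series (adequate for small matrices)."""
    n = len(A)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for m in range(1, terms):
        term = [[x / m for x in row] for row in matmul(term, A)]
        total = add(total, term)
    return total

def close(A, B, tol=1e-10):
    return all(abs(x - y) <= tol for r, s in zip(A, B) for x, y in zip(r, s))

I2 = [[1, 0], [0, 1]]
E1 = [[0.5j, 0], [0, -0.5j]]
E2 = [[0, 0.5], [-0.5, 0]]
E3 = [[0, 0.5j], [0.5j, 0]]

def tensor_rep(X):
    """(pi (x) pi)(X) = X (x) I + I (x) X."""
    return add(kron(X, I2), kron(I2, X))

# The Lie algebra formula preserves brackets.
preserves_bracket = close(bracket(tensor_rep(E1), tensor_rep(E2)), tensor_rep(E3))

# It is the derivative at t = 0 of exp(tX) (x) exp(tX).
h = 1e-5
plus = kron(expm(scale(h, E2)), expm(scale(h, E2)))
minus = kron(expm(scale(-h, E2)), expm(scale(-h, E2)))
derivative = scale(1 / (2 * h), add(plus, scale(-1, minus)))
matches_derivative = close(derivative, tensor_rep(E2), tol=1e-6)

print(preserves_bracket, matches_derivative)
```

Note that the naive guess $X \mapsto X \otimes X$ is not even linear in X, which is one way to see that the group formula cannot be reused at the Lie algebra level.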

Remark 16.49 If $(\Pi_1, V_1)$ and $(\Pi_2, V_2)$ are representations of a group G,
it is possible to view $V_1 \otimes V_2$ as a representation of the direct product group
$G \times G$, by setting

\[ (\Pi_1 \otimes \Pi_2)(A, B) = \Pi_1(A) \otimes \Pi_2(B). \]

Similarly, if $(\pi_1, V_1)$ and $(\pi_2, V_2)$ are representations of a Lie algebra $\mathfrak{g}$, it
is possible to view $V_1 \otimes V_2$ as a representation of $\mathfrak{g} \oplus \mathfrak{g}$ by setting

\[ (\pi_1 \otimes \pi_2)(X, Y) = \pi_1(X) \otimes I + I \otimes \pi_2(Y). \]

Nevertheless, it is, in most cases, more natural to view $V_1 \otimes V_2$ as a
representation of G itself, rather than of $G \times G$. Even if $V_1$ and $V_2$ are
irreducible representations of G, the space $V_1 \otimes V_2$ will in most cases fail
to be irreducible as a representation of G. If, for example, we take
$V_1 = V_2 = V$, then the space of symmetric tensors inside $V \otimes V$ will form a
nontrivial invariant subspace, unless dim V = 1. An important problem in
representation theory is to decompose $V_1 \otimes V_2$ as a direct sum of irreducible
representations, where $V_1$ and $V_2$ are irreducible representations of a fixed
group or Lie algebra. In the case of the Lie algebra su(2), this decomposition
is discussed in Sect. 17.9.
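
The invariance of the symmetric tensors can be seen from the fact that $A \otimes A$ commutes with the swap map $u \otimes v \mapsto v \otimes u$. The following Python sketch checks this for $V = \mathbb{C}^2$; the matrix `A` is an arbitrary invertible example, and `S` is the swap written in the basis $e_0 \otimes e_0, e_0 \otimes e_1, e_1 \otimes e_0, e_1 \otimes e_1$, both being illustrative assumptions.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def kron(A, B):
    """Kronecker product: the matrix of A (x) B in the product basis."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

# Swap operator on C^2 (x) C^2: exchanges e0 (x) e1 and e1 (x) e0.
S = [[1, 0, 0, 0],
     [0, 0, 1, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1]]

A = [[1, 2], [3, 5]]          # an arbitrary invertible matrix (det = -1)
AA = kron(A, A)               # how A acts on V (x) V

# A (x) A commutes with the swap ...
commutes_with_swap = matmul(S, AA) == matmul(AA, S)

# ... so symmetric tensors (fixed by S) are mapped to symmetric tensors.
symmetric = [0, 1, 1, 0]      # e0 (x) e1 + e1 (x) e0
image = matvec(AA, symmetric)
image_is_symmetric = matvec(S, image) == image

print(commutes_with_swap, image_is_symmetric)
```

For dim V = 2 the symmetric tensors form a three-dimensional subspace of the four-dimensional space $V \otimes V$, so the invariant subspace is indeed neither {0} nor everything.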

Definition 16.50 A finite-dimensional representation of a group or Lie 
algebra is said to be completely reducible if it is isomorphic to a direct 
sum of irreducible representations. 

Proposition 16.51 Every finite-dimensional unitary representation of a 
group or Lie algebra is completely reducible. 

Proof. Suppose $(\Pi, V)$ is a unitary representation of a matrix Lie group G.
If W is a subspace of V invariant under each $\Pi(A)$, then $W^\perp$ is invariant
under each $\Pi(A)^*$, as the reader may easily verify. But since $\Pi$ is unitary,

\[ \Pi(A)^* = \Pi(A)^{-1} = \Pi(A^{-1}). \]

Thus, $W^\perp$ is invariant under $\Pi(A^{-1})$ for all $A \in G$, hence under $\Pi(A)$ for all
$A \in G$. We conclude that, in the unitary case, the orthogonal complement
of an invariant subspace is always invariant.

If V is irreducible, there is nothing to prove. If not, we pick a nontrivial
invariant subspace W and decompose V as $W \oplus W^\perp$. The restriction of $\Pi$
to W or to $W^\perp$ is again a unitary representation, so we can repeat this
procedure for each of these subspaces. Since V is finite dimensional, the
process must eventually terminate, yielding an orthogonal decomposition
of V as a direct sum of irreducible invariant subspaces.

If we consider a unitary representation $\pi$ of a Lie algebra $\mathfrak{g}$, we have
the same argument, but with the identity $\Pi(A)^* = \Pi(A^{-1})$ replaced by
$\pi(X)^* = -\pi(X)$. ■

Proposition 16.52 Suppose K is a compact matrix Lie group. For any
finite-dimensional representation $(\Pi, V)$ of K, there exists an inner product
on V such that $\Pi(A)$ is unitary for all $A \in K$. In particular, every
finite-dimensional representation of K is completely reducible.

See Proposition 4.36 in [21]. 


16.9 Infinite-Dimensional Unitary Representations 

For the applications we have in mind, we need to consider representations
that are infinite dimensional. The theory of such representations is
inevitably more complicated than that of finite-dimensional representations.
For our purposes, it suffices to consider the nicest sort of
infinite-dimensional representations: unitary representations in a Hilbert space.

16.9.1 Ordinary Unitary Representations 

We begin by considering ordinary representations and then turn to
projective representations.

Definition 16.53 Suppose G is a matrix Lie group. Then a unitary
representation of G is a strongly continuous homomorphism $\Pi : G \to \mathrm{U}(H)$,
where H is a separable Hilbert space and U(H) is the group of unitary
operators on H. Here, strong continuity of $\Pi$ means that if a sequence $A_m$ in
G converges to $A \in G$, then

\[ \lim_{m \to \infty} \| \Pi(A_m)\psi - \Pi(A)\psi \| = 0 \]

for all $\psi \in H$.

We can attempt to associate to a unitary representation $\Pi$ of G some
sort of representation $\pi$ of the Lie algebra $\mathfrak{g}$ of G, by imitating the
construction in Theorem 16.23. For any $X \in \mathfrak{g}$, the map $t \mapsto \Pi(e^{tX})$ is a
strongly continuous one-parameter unitary group. Thus, Stone’s theorem
(Theorem 10.15) tells us that there exists a unique self-adjoint operator A
such that $\Pi(e^{tX}) = e^{itA}$ for all $t \in \mathbb{R}$. If we let $\pi(X)$ denote the
skew-self-adjoint operator iA, we will have

\[ \Pi(e^{tX}) = e^{t\pi(X)}. \qquad (16.5) \]

The operators $\pi(X)$, $X \in \mathfrak{g}$, are in general unbounded and defined only
on a dense subspace of H. Nevertheless, it can be shown (see, e.g., [43])
that there exists a dense subspace V of H that is contained in the domain of
each $\pi(X)$ and invariant under each $\pi(X)$, and on which we have
$\pi([X, Y]) = [\pi(X), \pi(Y)]$. In the case of the particular representation that
we will consider in the next chapter, we can avoid these difficulties by
looking at finite-dimensional invariant subspaces.

Proposition 16.54 Suppose G is a matrix Lie group and $\Pi : G \to \mathrm{U}(H)$ is
a unitary representation of G. For each $X \in \mathfrak{g}$, let $\pi(X)$ denote the operator
in (16.5). Suppose $V \subset H$ is a finite-dimensional subspace of H such that
$\Pi(A)$ maps V into V, for all $A \in G$. Then for all $X \in \mathfrak{g}$, $V \subset \mathrm{Dom}(\pi(X))$,
$\pi(X)$ maps V into V, and we have

\[ \pi([X, Y])v = [\pi(X), \pi(Y)]v \qquad (16.6) \]

for all $v \in V$.

In the other direction, suppose G is connected and suppose V is any
finite-dimensional subspace of H such that for all $X \in \mathfrak{g}$, $V \subset \mathrm{Dom}(\pi(X))$
and $\pi(X)$ maps V into V. Then $\Pi(A)$ also maps V into V, for all $A \in G$.

Proof. Since V is invariant under both $\Pi(A)$ and $\Pi(A)^* = \Pi(A^{-1})$, the
restriction to V of each $\Pi(A)$ is unitary. The operators $\Pi(A)|_V$ form a
finite-dimensional unitary representation of G that is strongly continuous
and thus continuous. (In the finite-dimensional case, all reasonable notions
of continuity for representations coincide.) For each $X \in \mathfrak{g}$, Theorem 16.18
tells us that there is an operator $\tilde{X}$ on V such that

\[ \Pi(e^{tX})|_V = e^{t\tilde{X}}. \]

Thus, for any $v \in V$, we have

\[ \lim_{t \to 0} \frac{\Pi(e^{tX})v - v}{t} = \lim_{t \to 0} \frac{e^{t\tilde{X}}v - v}{t} = \tilde{X}v. \]

This calculation shows that v is in the domain of the infinitesimal
generator $\pi(X)$ of the unitary group $\Pi(e^{tX})$, and that $\pi(X)v = \tilde{X}v$. Since the
operators $\tilde{X}$, $X \in \mathfrak{g}$, form a representation of $\mathfrak{g}$, we have the relation (16.6).

In the other direction, if V is invariant under $\pi(X)$, the restriction of
$\pi(X)$ to V is automatically bounded. Thus, there is a constant C such that

\[ \| \pi(X)^m v \| \le C^m \| v \| \qquad (16.7) \]

for all $v \in V$. If we use the direct-integral form of the spectral theorem
for the self-adjoint operator $A := -i\pi(X)$, it is easy to see that (16.7) can
only hold if v, viewed as an element of the direct integral, is supported on
a bounded interval inside the spectrum of A. Since the power series of the
function $\lambda \mapsto e^{it\lambda}$ converges to $e^{it\lambda}$ uniformly on any finite interval, we will
have

\[ \Pi(e^{tX})v = e^{itA}v = \sum_{m=0}^{\infty} \frac{t^m \pi(X)^m}{m!}\, v. \]

Each term in the above power series belongs to V, which is finite
dimensional and thus closed. We conclude that $\Pi(e^{tX})v$ belongs to V for all
$X \in \mathfrak{g}$. Since G is connected, each element of G is a product of
exponentials of Lie algebra elements, and we have the claim. ■
16.9.2 Projective Unitary Representations

Given a Hilbert space $\mathbf{H}$, let $S^{\mathbf{H}}$ denote the unit sphere in $\mathbf{H}$, that is, the
set of vectors with norm 1. Let $P\mathbf{H}$ be the quotient space $(S^{\mathbf{H}})/\sim$, where $\sim$
denotes the equivalence relation in which $u \sim v$ if and only if $u = e^{i\theta}v$
for some $\theta \in \mathbb{R}$. The quotient map $q : S^{\mathbf{H}} \to P\mathbf{H}$ induces a topology on
$P\mathbf{H}$ in which a set $U \subset P\mathbf{H}$ is open if and only if $q^{-1}(U)$ is open as a
subset of the metric space $S^{\mathbf{H}} \subset \mathbf{H}$.

As in the finite-dimensional case, we can form the quotient group 

$$PU(\mathbf{H}) := U(\mathbf{H})/\{e^{i\theta}I\}.$$

The action of $U(\mathbf{H})$ on $S^{\mathbf{H}}$ descends to a well-defined action of $PU(\mathbf{H})$
on $P\mathbf{H}$.

Definition 16.55 A projective unitary representation of a matrix Lie
group $G$ is a homomorphism $\Pi : G \to PU(\mathbf{H})$, for some Hilbert space $\mathbf{H}$,
with the property that if a sequence $A_m$ in $G$ converges to $A$ in $G$, then

$$\Pi(A_m)x \to \Pi(A)x$$

for all $x \in P\mathbf{H}$.

Recall that in the finite-dimensional case, every projective unitary
representation of $G$ can be “de-projectivized” at the expense of possibly having
to pass to the universal cover $\tilde{G}$ of $G$ (Theorem 16.47). The
de-projectivization proceeds by passing to the Lie algebra, choosing the
trace-zero representative of each equivalence class, and then exponentiating
back to the universal cover of the original group. This approach does
not work in the infinite-dimensional case. After all, even assuming we can
construct a Lie algebra homomorphism $\pi(X)$ for each $X \in \mathfrak{g}$, the
representatives of $\pi(X)$ are typically unbounded operators on $\mathbf{H}$, for which the
notion of trace does not make sense. This difficulty is not just a
technicality; the corresponding result in the infinite-dimensional case is false, as we
will now see.

Example 16.56 For all $(a, b) \in \mathbb{R}^2$, define an operator $T_{(a,b)}$ on $L^2(\mathbb{R})$ by

$$(T_{(a,b)}\psi)(x) = e^{iax}\psi(x - b).$$

Then $T_{(a,b)}$ is unitary for all $(a, b) \in \mathbb{R}^2$ and we have

$$(T_{(a,b)}T_{(a',b')}\psi)(x) = e^{iax}e^{ia'(x-b)}\psi(x - (b + b'))$$
$$= e^{-ia'b}(T_{(a+a',b+b')}\psi)(x). \tag{16.8}$$

The map $(a, b) \mapsto [T_{(a,b)}]$ is a homomorphism of $\mathbb{R}^2$ into $PU(L^2(\mathbb{R}))$, and
this homomorphism is continuous in the sense of Definition 16.55. There
does not, however, exist any homomorphism $S : \mathbb{R}^2 \to U(L^2(\mathbb{R}))$ such that
$[S_{(a,b)}] = [T_{(a,b)}]$ for all $(a, b) \in \mathbb{R}^2$.
Thus, even though $\mathbb{R}^2$ is simply connected (and thus its own universal
cover), there is no way to de-projectivize the projective unitary
representation $(a, b) \mapsto [T_{(a,b)}]$ of $\mathbb{R}^2$.

Proof. The map $(a, b) \mapsto T_{(a,b)}$ is easily seen to be strongly continuous,
and thus the map $(a, b) \mapsto [T_{(a,b)}]$ is continuous in the sense of
Definition 16.55. If a homomorphism $S$ with the indicated properties existed,
then there would be constants $\theta_{a,b}$ such that $S_{(a,b)} = e^{i\theta_{a,b}}T_{(a,b)}$. But then
since $S$ is a homomorphism from the commutative group $\mathbb{R}^2$ into $U(L^2(\mathbb{R}))$,
the operator $S_{(a,b)}$ would have to commute with $S_{(a',b')}$ for all $(a, b)$ and
$(a', b')$. But then the operators $T_{(a,b)}$ and $T_{(a',b')}$, being constant multiples
of commuting operators, would need to commute as well. But this is not the
case; for example, $T_{(a,0)}$ does not commute with $T_{(0,b)}$ unless $ab$ is an
integer multiple of $2\pi$, as is easily verified using (16.8). ■
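The relation (16.8) and the failure of commutativity can be spot-checked numerically. The following is only a sketch under stated assumptions: operators are modeled as maps on Python callables, the identity is tested pointwise at sample values rather than in $L^2$, and the names `T`, `psi`, `weyl_defect`, and `commutator_size` are hypothetical helpers introduced for illustration.

```python
import cmath


def T(a, b):
    """Return the operator T_(a,b): psi -> (x -> e^{iax} psi(x - b))."""
    def apply(psi):
        return lambda x: cmath.exp(1j * a * x) * psi(x - b)
    return apply


def psi(x):
    # A Gaussian test function; any function works for a pointwise check.
    return cmath.exp(-x * x / 2)


def weyl_defect(a, b, a2, b2, x):
    """|T_(a,b) T_(a2,b2) psi(x) - e^{-i a2 b} T_(a+a2,b+b2) psi(x)|, cf. (16.8)."""
    lhs = T(a, b)(T(a2, b2)(psi))(x)
    rhs = cmath.exp(-1j * a2 * b) * T(a + a2, b + b2)(psi)(x)
    return abs(lhs - rhs)


def commutator_size(a, b, x):
    """|(T_(a,0) T_(0,b) - T_(0,b) T_(a,0)) psi(x)| at the point x."""
    first = T(a, 0.0)(T(0.0, b)(psi))(x)
    second = T(0.0, b)(T(a, 0.0)(psi))(x)
    return abs(first - second)
```

For instance, `weyl_defect(0.7, -1.3, 2.1, 0.4, 0.25)` should vanish up to rounding, while `commutator_size(1.0, 1.0, 0.3)` is visibly nonzero, in line with the phase factor $e^{-iab}$ picked up when the two operators are interchanged.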

Despite the negative result in Example 16.56, there is a positive result in
this direction: If $G$ is connected and “semi-simple,” every projective unitary
representation of $G$ can be de-projectivized after passing to the universal
cover. Here, a Lie algebra $\mathfrak{g}$ is said to be simple if $\mathfrak{g}$ has no nontrivial ideals
and $\dim \mathfrak{g} \geq 2$. A Lie algebra is said to be semi-simple if it is a direct sum
of simple algebras. Finally, a Lie group $G$ is said to be semi-simple if the
Lie algebra $\mathfrak{g}$ of $G$ is semi-simple.

For any connected Lie group $G$, a projective unitary representation $\Pi$ of
$G$ can be de-projectivized by passing to a one-dimensional central
extension. A one-dimensional central extension of $G$ is a Lie group $G'$ together
with a surjective homomorphism $\Phi : G' \to G$ such that the kernel of $\Phi$ is
one-dimensional and contained in the center of $G'$. See the article [1] of V.
Bargmann for more information about these issues.


16.10 Exercises 

1. Suppose that $G$ is a connected matrix Lie group and that $N$ is a
discrete normal subgroup of $G$, meaning that there is some
neighborhood $U$ of $I$ in $G$ such that $U \cap N = \{I\}$. Show that $N$ is contained
in the center of $G$.

Hint: Consider the quantity $gng^{-1}$ for $g \in G$ and $n \in N$.

2. (a) Suppose two elements $U$ and $V$ of SU(2) commute. Show that
each eigenspace for $U$ is invariant under $V$ and vice versa.

(b) Show that if $U$ is in the center of SU(2), then $U = I$ or $U = -I$.

3. Define the Hilbert-Schmidt norm of a matrix $X \in M_n(\mathbb{C})$ by the
formula

$$\|X\|_{HS}^2 = \sum_{j,k=1}^{n} |X_{jk}|^2.$$

Using the Cauchy-Schwarz inequality, show that

$$\|XY\|_{HS} \leq \|X\|_{HS}\,\|Y\|_{HS} \tag{16.9}$$

for all $X, Y \in M_n(\mathbb{C})$.

4. Using term-by-term differentiation of power series, show that for all
$X \in M_n(\mathbb{C})$ and all $1 \leq j, k \leq n$, we have

$$\left.\frac{d}{dt}\left(e^{tX}\right)_{jk}\right|_{t=0} = X_{jk}.$$


5. Verify Property 4 of Theorem 16.15. This should be easy in the case
that $X$ is diagonalizable. In the general case, either use the Jordan
canonical form or appeal to the fact that diagonalizable matrices are
dense in $M_n(\mathbb{C})$.

6. Suppose $X$ and $Y$ are commuting $n \times n$ matrices. Show that

$$e^X e^Y = e^{X+Y}.$$

This is Property 5 of Theorem 16.15.

Hint: Multiply together the power series for $e^X$ and $e^Y$ and then
group terms where the total power of $X$ and $Y$ is $n$.

7. For $A \in M_n(\mathbb{C})$, define the logarithm of $A$ by the power series

$$\log A = (A - I) - \frac{(A - I)^2}{2} + \frac{(A - I)^3}{3} - \cdots$$

whenever this series converges. Assume the following result: If $A$ is
sufficiently close to $I$, then $\log A$ is defined and $\exp(\log A) = A$.
[This can be seen easily when $A$ is diagonalizable, and the set of
diagonalizable matrices is dense in $M_n(\mathbb{C})$.]

(a) Show that there exists a constant $C$ such that for all $A$ with
$\|A - I\| < 1/2$ we have

$$\|\log A - (A - I)\| \leq C\,\|A - I\|^2.$$

(b) Show that for all $X, Y \in M_n(\mathbb{C})$ we have

$$\log\left(e^{X/m}e^{Y/m}\right) = \frac{X}{m} + \frac{Y}{m} + O\left(\frac{1}{m^2}\right). \tag{16.10}$$

Note that $e^{X/m}e^{Y/m}$ tends to $I$ as $m$ tends to infinity, so that
the left-hand side of (16.10) is defined for all sufficiently large $m$.

(c) Prove the Lie Product Formula.
8. (a) Show that for all $X, Y \in M_n(\mathbb{C})$,

$$\left\| \left.\frac{d}{dt}(X + tY)^m\right|_{t=0} \right\| \leq m\,\|X\|^{m-1}\,\|Y\|.$$

(b) Show that the map $X \mapsto e^X$ is a continuously differentiable
map of $M_n(\mathbb{C}) \cong \mathbb{R}^{2n^2}$ to itself.

(c) Using Exercise 4, show that the differential of the map $X \mapsto e^X$
at $X = 0$ is the identity map of $M_n(\mathbb{C})$ to itself. (Recall that the
differential of a smooth map of $\mathbb{R}^j$ to $\mathbb{R}^k$, evaluated at a point in
$\mathbb{R}^j$, is a linear map of $\mathbb{R}^j$ to $\mathbb{R}^k$.)

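9. Suppose $\mathfrak{g}$ is a Lie algebra and $\mathfrak{h}$ is an ideal in $\mathfrak{g}$. Let $\mathfrak{g}/\mathfrak{h}$ denote the
vector space quotient of $\mathfrak{g}$ by $\mathfrak{h}$. Show that the bracket on $\mathfrak{g}$ descends
unambiguously to a bilinear map on $\mathfrak{g}/\mathfrak{h}$, and that $\mathfrak{g}/\mathfrak{h}$ forms a Lie
algebra under this map.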


10. Suppose that $G_1$, $G_2$, and $G_3$ are matrix Lie groups with Lie algebras
$\mathfrak{g}_1$, $\mathfrak{g}_2$, and $\mathfrak{g}_3$, respectively. Suppose that $\Phi : G_1 \to G_2$ and $\Psi :
G_2 \to G_3$ are Lie group homomorphisms with associated Lie algebra
homomorphisms $\phi$ and $\psi$, respectively. Show that the Lie algebra
homomorphism associated to $\Psi \circ \Phi : G_1 \to G_3$ is $\psi \circ \phi$.


11. Show that isomorphic matrix Lie groups have isomorphic Lie alge¬ 
bras. 


12. Suppose $G_1$ and $G_2$ are matrix Lie groups with Lie algebras $\mathfrak{g}_1$ and
$\mathfrak{g}_2$, respectively. Suppose $\Phi : G_1 \to G_2$ is a Lie group homomorphism
with the property that the associated Lie algebra homomorphism
$\phi : \mathfrak{g}_1 \to \mathfrak{g}_2$ is injective. Show that there exists a neighborhood $U$ of
the identity in $G_1$ such that $U \cap \ker \Phi = \{I\}$.

Hint: Use Theorem 16.25. 

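13. (a) Show that every $R \in \mathrm{SO}(3)$ has an eigenvalue of 1.

(b) Show that every $R \in \mathrm{SO}(3)$ is conjugate in SO(3) to a matrix of
the form

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix}$$

for some $\theta \in \mathbb{R}$.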

(c) Show that the exponential map from so(3) to SO(3) is surjective. 

(d) Show that SO(3) is connected. 

14. Show that the center of SO(3) is trivial. 

Hint: Use Part (a) of Exercise 13. 
15. Given a Lie algebra $\mathfrak{g}$, let $[\mathfrak{g}, \mathfrak{g}]$ denote the space of linear combinations
of commutators, that is, the space spanned by elements of the form
$[X, Y]$ with $X, Y \in \mathfrak{g}$.

(a) Show that $[\mathfrak{g}, \mathfrak{g}]$ is an ideal in $\mathfrak{g}$ and that the quotient $\mathfrak{g}/[\mathfrak{g}, \mathfrak{g}]$
is commutative. (The ideal $[\mathfrak{g}, \mathfrak{g}]$ is called the commutator ideal
of $\mathfrak{g}$.)

(b) If $\mathfrak{g} = \mathrm{so}(3)$, show that $[\mathfrak{g}, \mathfrak{g}] = \mathfrak{g}$.

(c) If $\pi : \mathfrak{g} \to \mathrm{gl}(V)$ is any finite-dimensional representation of $\mathfrak{g}$,
show that $\pi([\mathfrak{g}, \mathfrak{g}])$ is contained in $\mathrm{sl}(V)$, the space of
endomorphisms of $V$ with trace zero.

16. (a) Show that the Lie algebra $\mathrm{pu}(n) = \mathrm{u}(n)/\{iaI\}$ is isomorphic to
the Lie algebra $\mathrm{su}(n)$.

(b) Let $\{e^{2\pi ik/n}I\}$ denote the group of matrices that are of the form
of an $n$th root of unity times the identity. Show that the group
$\mathrm{PU}(n)$ is isomorphic to $\mathrm{SU}(n)/\{e^{2\pi ik/n}I\}$.

17. Suppose that $G$ is a matrix Lie group with Lie algebra $\mathfrak{g}$ and that
$A$ is an element of $G$. Show that the operation of left multiplication
by $A^{-1}$ is a diffeomorphism of $M_n(\mathbb{C})$. Now show that there exist
neighborhoods $U$ of 0 in $M_n(\mathbb{C})$ and $V$ of $A$ in $M_n(\mathbb{C})$ such that the
map $X \mapsto Ae^X$ maps $U$ diffeomorphically onto $V$ and such that for
$X \in U$, we have $X \in \mathfrak{g}$ if and only if $Ae^X \in G$. (Use Theorem 16.25.)

18. Suppose that $Z \in M_n(\mathbb{C})$ has the property that $ZX = XZ$ for all
$X \in M_n(\mathbb{C})$. Show that $Z = cI$ for some $c \in \mathbb{C}$.

19. Suppose $(\Pi, \mathbf{H})$ is a unitary representation of a matrix Lie group
$G$, and suppose $V_1$ and $V_2$ are finite-dimensional irreducible invariant
subspaces of $\mathbf{H}$. Show that if $V_1$ and $V_2$ are not isomorphic as
representations of $G$, then $V_1$ is orthogonal to $V_2$ inside $\mathbf{H}$.

Hint: Show that the orthogonal projection of $\mathbf{H}$ onto $V_1$ or $V_2$ is an
intertwining map, and use Schur’s lemma.


17 

Angular Momentum and Spin 


17.1 The Role of Angular Momentum 
in Quantum Mechanics 

Classically, angular momentum may be thought of as the Hamiltonian 
generator of rotations (Proposition 2.30). Angular momentum is a particu¬ 
larly useful concept when a system has rotational symmetry, since in that 
case the angular momentum is a conserved quantity (Proposition 2.18). 
Quantum mechanically, angular momentum is still the “generator” of ro¬ 
tations, meaning that it is the infinitesimal generator of a one-parameter 
group of unitary rotation operators, in the sense of Stone’s theorem (The¬ 
orem 10.15). The quantum angular momentum is again conserved in sys¬ 
tems with rotational symmetry. This means that if the Hamiltonian H is 
invariant under rotations, then H commutes with the angular momentum 
operators, in which case, the angular momentum operators are constants 
of motion in the quantum mechanical sense. 

The various components of the classical angular momentum vector for 
a particle in R 3 satisfy certain simple commutation relations under the 
Poisson bracket (Exercise 19 in Chap. 2). We will see that those relations are 
the commutation relations for the Lie algebra so(3) of the rotation group 
SO(3). If H commutes with each component of the angular momentum, 
each eigenspace for $H$ (the solution space to $H\psi = \lambda\psi$ for a given $\lambda$) is
invariant under the angular momentum operators. Thus, the eigenspace 
constitutes a representation of the Lie algebra so(3). By classifying the 
irreducible (finite-dimensional) representations of so(3), we can obtain a lot 

B.C. Hall, Quantum Theory for Mathematicians , Graduate Texts 367 

in Mathematics 267, DOI 10.1007/978-l-4614-7116-5_17, 

© Springer Science+Business Media New York 2013 




of information about the structure of the solution spaces to the equation
$H\psi = \lambda\psi$, in the case that $H$ is invariant under rotations. Specifically, the
representation theory of so(3) allows us to determine completely the angular
dependence of a solution $\psi(x)$, leaving only the radial dependence of $\psi$ to
be determined. This has the effect of reducing the number of independent
variables from three to one (just the radius $r$ in polar coordinates), thereby
reducing the problem to solving an ordinary differential equation.

Understanding angular momentum from the point of view of representa¬ 
tions of a Lie algebra also prepares us to understand the concept of spin. 
The Hilbert space for a particle in $\mathbb{R}^3$ with spin is the tensor product
of $L^2(\mathbb{R}^3)$ with a finite-dimensional vector space $V$, where $V$ carries an
irreducible action of the rotation group SO(3). In this setting, the proper
notion of “action” is a projective representation of SO(3), meaning a family
of operators satisfying the relations of SO(3) up to phase factors (constants
of absolute value one). These phase factors are permitted because,
physically, two vectors that differ only by a constant represent the same physical
state. By Proposition 16.46, every projective representation of SO(3) can
be de-projectivized at the level of the Lie algebra so(3). Conversely, every
irreducible ordinary representation of the Lie algebra so(3) gives rise to a
representation of the universal cover SU(2) of SO(3), which in turn gives
rise (Theorem 16.47) to a projective representation of SO(3). Thus, the
possibilities for the space $V$ are in one-to-one correspondence with the
irreducible representations of the Lie algebra so(3). In the case of “half-integer
spin,” the space $V$ does not carry an ordinary representation of the group
SO(3).

17.2 The Angular Momentum Operators in $\mathbb{R}^3$

Recall from Sect. 2.4 that the classical angular momentum for a particle in
$\mathbb{R}^3$ is given by $J = x \times p$, so that, say, $J_3 = x_1p_2 - x_2p_1$. As in Sect. 3.10,
we introduce the quantum mechanical counterpart, a “vector” $\hat{J}$ with
components that are operators,

$$\hat{J} = X \times P.$$

Thus, for example, $\hat{J}_1 = X_2P_3 - X_3P_2$. Note that each component of the
angular momentum involves products of distinct components of the
position and momentum operators $X$ and $P$, which commute. Thus, in the
expression for, say, $\hat{J}_1$, it does not matter whether we write $X_2P_3$ or $P_3X_2$.

The angular momentum operators are unbounded operators and are
defined only on a dense subspace of $L^2(\mathbb{R}^3)$. For the moment, we will not
specify the domain of these operators, leaving that until the next section.
We will see, however, that the domain of each angular momentum operator
contains the Schwartz space $\mathcal{S}(\mathbb{R}^3)$ (Definition A.15).
As in Exercise 10 in Chap. 3, we can use the canonical commutation
relations to obtain $[\hat{J}_1, \hat{J}_2] = i\hbar\hat{J}_3$. We may similarly compute $[\hat{J}_2, \hat{J}_3]$ and
$[\hat{J}_3, \hat{J}_1]$ to obtain the complete set of commutation relations among the $\hat{J}$’s:

$$\frac{1}{i\hbar}[\hat{J}_1, \hat{J}_2] = \hat{J}_3; \quad \frac{1}{i\hbar}[\hat{J}_2, \hat{J}_3] = \hat{J}_1; \quad \frac{1}{i\hbar}[\hat{J}_3, \hat{J}_1] = \hat{J}_2.$$

These relations compare well with the Poisson bracket relations among the
various components of the classical angular momentum vector (Exercise 19
in Chap. 2).
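For orientation, here is a sketch of how the first relation comes out of the canonical commutation relations $[X_j, P_k] = i\hbar\delta_{jk}I$, applied on a common invariant domain such as the Schwartz space. It assumes the component formulas $\hat{J}_1 = X_2P_3 - X_3P_2$ and $\hat{J}_2 = X_3P_1 - X_1P_3$ read off from $\hat{J} = X \times P$; the hat notation is a labeling assumption for the quantum operators.

```latex
\begin{align*}
[\hat{J}_1, \hat{J}_2]
  &= [X_2 P_3 - X_3 P_2,\; X_3 P_1 - X_1 P_3] \\
  &= [X_2 P_3,\, X_3 P_1] + [X_3 P_2,\, X_1 P_3]
     && \text{(the two cross terms involve only commuting operators)} \\
  &= X_2 [P_3, X_3] P_1 + X_1 [X_3, P_3] P_2 \\
  &= -i\hbar\, X_2 P_1 + i\hbar\, X_1 P_2 \\
  &= i\hbar\, \hat{J}_3 .
\end{align*}
```

The other two relations follow by cyclically permuting the indices 1, 2, 3.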

Writing out $\hat{J}_3$ explicitly, we have

$$(\hat{J}_3\psi)(x) = -i\hbar\left(x_1\frac{\partial}{\partial x_2} - x_2\frac{\partial}{\partial x_1}\right)\psi(x) \tag{17.1}$$

$$= -i\hbar\left.\frac{d}{d\theta}\psi(R_\theta x)\right|_{\theta=0}, \tag{17.2}$$

where $R_\theta$ denotes a counterclockwise rotation by angle $\theta$ in the $(x_1, x_2)$
plane, with similar expressions for $\hat{J}_1$ and $\hat{J}_2$. This description of the
angular momentum operators demonstrates that they, like the components of
the classical angular momentum, are closely connected to rotations (recall
Propositions 2.18 and 2.30). The connection between angular momentum
and rotations will be made more explicit in the following sections by
recognizing that they make up the Lie algebra action associated with the natural
action of the rotation group on $L^2(\mathbb{R}^3)$.
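The agreement between the differential-operator form (17.1) and the rotation-derivative form (17.2) can be illustrated with finite differences. This is a rough numerical sketch, not part of the text's argument: it assumes units with $\hbar = 1$, uses a hypothetical smooth test function `psi`, and approximates all derivatives by central differences with a hypothetical step `h`.

```python
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def psi(x1, x2, x3):
    # Hypothetical smooth test function that is not symmetric about the x3-axis.
    return (x1 + 2.0 * x2 * x2) * math.exp(-(x1 * x1 + x2 * x2 + x3 * x3))


def rotate(theta, x1, x2, x3):
    """Counterclockwise rotation by theta in the (x1, x2) plane."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * x1 - s * x2, s * x1 + c * x2, x3)


def j3_via_rotation(x, h=1e-5):
    """-i*hbar * d/dtheta psi(R_theta x) at theta = 0, as in (17.2)."""
    plus = psi(*rotate(h, *x))
    minus = psi(*rotate(-h, *x))
    return -1j * HBAR * (plus - minus) / (2 * h)


def j3_via_partials(x, h=1e-5):
    """-i*hbar * (x1 d/dx2 - x2 d/dx1) psi(x), as in (17.1)."""
    x1, x2, x3 = x
    d1 = (psi(x1 + h, x2, x3) - psi(x1 - h, x2, x3)) / (2 * h)
    d2 = (psi(x1, x2 + h, x3) - psi(x1, x2 - h, x3)) / (2 * h)
    return -1j * HBAR * (x1 * d2 - x2 * d1)
```

At a sample point such as `x = (0.3, -0.7, 0.5)`, the two functions should agree up to discretization error, and the common value is nonzero because `psi` is not invariant under rotations about the third axis.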

We may define a new version of the angular momentum operators, $\tilde{J}_j$,
given by

$$\tilde{J}_j = \frac{1}{\hbar}\hat{J}_j. \tag{17.3}$$

Since Planck’s constant and angular momentum have the same units, the
$\tilde{J}_j$’s do not depend on the choice of units; we refer to them as the
dimensionless versions of the angular momentum operators.


17.3 Angular Momentum from the Lie Algebra 
Point of View 

We begin this section by looking at the natural action of the rotation group
SO(3) on $L^2(\mathbb{R}^3)$.

Definition 17.1 For each $R \in \mathrm{SO}(3)$, define $\Pi(R) : L^2(\mathbb{R}^3) \to L^2(\mathbb{R}^3)$ by

$$(\Pi(R)\psi)(x) = \psi(R^{-1}x). \tag{17.4}$$

Proposition 17.2 For each $R \in \mathrm{SO}(3)$, the map $\Pi(R) : L^2(\mathbb{R}^3) \to L^2(\mathbb{R}^3)$
is unitary. Furthermore, the map $\Pi : \mathrm{SO}(3) \to U(L^2(\mathbb{R}^3))$ is a strongly
continuous homomorphism.
Proof. Since the Lebesgue measure on $\mathbb{R}^3$ is invariant under rotations,
$\Pi(R)$ is unitary for all $R \in \mathrm{SO}(3)$. It is easily checked that $\Pi(R_1R_2) =
\Pi(R_1)\Pi(R_2)$; for this to be true, we need to have $\psi(R^{-1}x)$ rather than
$\psi(Rx)$ in the definition of $\Pi(R)$. Arguing as in the proof of Example 10.12,
we can easily verify that $\Pi$ is strongly continuous. ■

Recall the computation of the Lie algebra so(3) of SO(3) in Sect. 16.5,
and the basis $\{F_1, F_2, F_3\}$ for so(3) in (16.2) in that section.

Proposition 17.3 For each $X \in \mathrm{so}(3)$, let $\pi(X)$ denote the skew-self-adjoint
operator such that

$$\Pi(e^{tX}) = e^{t\pi(X)}. \tag{17.5}$$

Then the domain of each $\pi(F_j)$ contains the Schwartz space $\mathcal{S}(\mathbb{R}^3)$ and on
$\mathcal{S}(\mathbb{R}^3)$ we have the relation

$$\hat{J}_j = i\hbar\pi(F_j).$$

In the notation of Stone’s theorem (Theorem 10.15), the operator $\pi(X)$
in (17.5) is $i$ times the infinitesimal generator of the one-parameter unitary
group $t \mapsto \Pi(e^{tX})$.

Proof. In the case of $\hat{J}_3$, we compute as in Example 16.16 that $e^{tF_3}$ is a
counterclockwise rotation in the $(x_1, x_2)$-plane. If $\psi$ belongs to $\mathcal{S}(\mathbb{R}^3)$ then
the limit defining the derivative in (17.2) is easily seen to hold in the $L^2$
sense. Thus, recalling the inverse on the right-hand side of (17.4), we see
that $\hat{J}_3$ coincides with $i\hbar\pi(F_3)$, as claimed. Similar calculations apply to
$\hat{J}_1$ and $\hat{J}_2$. ■

Although it is not easy to determine the precise domain of each angular
momentum operator, we can see from Proposition 16.54 that if $\psi$ belongs
to a finite-dimensional subspace of $L^2(\mathbb{R}^3)$ that is invariant under rotations,
then $\psi$ belongs to the domain of each $\hat{J}_j$.


17.4 The Irreducible Representations of so(3) 

In this section, we classify the irreducible finite-dimensional representations 
of the Lie algebra so(3), up to isomorphism. (See Sect. 16.7 for the defini¬ 
tions and elementary properties of representations.) All representations are 
taken over the field of complex numbers and assumed to have dimension 
at least one. We continue to use the basis $\{F_1, F_2, F_3\}$ for so(3) in (16.2).

Theorem 17.4 Let $\pi : \mathrm{so}(3) \to \mathrm{gl}(V)$ be a finite-dimensional irreducible
representation of so(3). Define operators $L^+$, $L^-$, and $L_3$ on $V$ by

$$L^+ = i\pi(F_1) - \pi(F_2)$$
$$L^- = i\pi(F_1) + \pi(F_2)$$
$$L_3 = i\pi(F_3).$$

Let $l = \frac{1}{2}(\dim V - 1)$, so that $\dim V = 2l + 1$. Then there exists a basis
$v_0, v_1, \ldots, v_{2l}$ of $V$ such that

$$L_3 v_j = (l - j)v_j$$
$$L^- v_j = \begin{cases} v_{j+1} & \text{if } j < 2l \\ 0 & \text{if } j = 2l \end{cases} \tag{17.6}$$
$$L^+ v_j = \begin{cases} j(2l + 1 - j)v_{j-1} & \text{if } j > 0 \\ 0 & \text{if } j = 0. \end{cases}$$

Thus, the quantity $l$ completely determines the structure of an irreducible
representation of so(3). Since $\dim V$ is a positive integer, $l$ has to have one
of the following values:

$$l = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \ldots. \tag{17.7}$$

The proof of Theorem 17.4 is given later in this section. 
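As a concrete illustration of the formulas in (17.6), the following sketch builds the matrices of $L_3$, $L^+$, and $L^-$ in the basis $v_0, \ldots, v_{2l}$ and checks the commutation relations $[L_3, L^\pm] = \pm L^\pm$ and $[L^+, L^-] = 2L_3$ in exact rational arithmetic. The function names (`spin_matrices`, `check_relations`, and the small matrix helpers) are hypothetical, and a check for finitely many values of $l$ is of course not a proof.

```python
from fractions import Fraction


def spin_matrices(two_l):
    """Matrices of L3, L+, L- in the basis v_0..v_{2l}, read off from (17.6).

    two_l is the integer 2l, so half-integer l is allowed.
    Column j of each matrix holds the coordinates of the image of v_j.
    """
    n = two_l + 1
    l = Fraction(two_l, 2)
    L3 = [[Fraction(0)] * n for _ in range(n)]
    Lp = [[Fraction(0)] * n for _ in range(n)]
    Lm = [[Fraction(0)] * n for _ in range(n)]
    for j in range(n):
        L3[j][j] = l - j                                   # L3 v_j = (l - j) v_j
        if j > 0:
            Lp[j - 1][j] = Fraction(j * (two_l + 1 - j))   # L+ v_j = j(2l+1-j) v_{j-1}
        if j < two_l:
            Lm[j + 1][j] = Fraction(1)                     # L- v_j = v_{j+1}
    return L3, Lp, Lm


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]


def scale(c, A):
    return [[c * entry for entry in row] for row in A]


def check_relations(two_l):
    """True if [L3,L+] = L+, [L3,L-] = -L-, and [L+,L-] = 2 L3 hold exactly."""
    L3, Lp, Lm = spin_matrices(two_l)
    return (commutator(L3, Lp) == Lp
            and commutator(L3, Lm) == scale(-1, Lm)
            and commutator(Lp, Lm) == scale(2, L3))
```

Calling `check_relations(two_l)` for `two_l = 0, 1, 2, ...` covers both integer and half-integer values of $l$.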

Definition 17.5 If $(\pi, V)$ is an irreducible finite-dimensional
representation of so(3), then the spin of $(\pi, V)$ is the largest eigenvalue $l$ of the operator
$L_3 := i\pi(F_3)$. Equivalently, $l$ is the unique number such that $\dim V = 2l + 1$.

Our next result says that all the values of $l$ in (17.7) actually arise as
spins of irreducible representations of so(3).

Theorem 17.6 For any $l = 0, \frac{1}{2}, 1, \frac{3}{2}, \ldots$ there exists an irreducible
representation of so(3) of dimension $2l + 1$, and any two irreducible
representations of so(3) of dimension $2l + 1$ are isomorphic.

Note that the theorem is only asserting the existence, for each $l$, of a
representation of the Lie algebra so(3). As we will see in the next section,
an irreducible representation $\pi$ of so(3) comes from a representation $\Pi$ of
SO(3) if and only if $l$ is an integer. Nevertheless, the representations of
so(3) with half-integer values of $l$ (the ones where $l$ is half of an integer
but not an integer) still play an important role in quantum physics, as
discussed in Sect. 17.8. (Although it would be clearer to refer to the case
$l = 1/2, 3/2, 5/2, \ldots$ as “integer plus a half,” the terminology “half-integer”
is firmly established.)

By comparison to Proposition 17.3, we may think of $L_3$ as the analog
of the third component of the dimensionless angular momentum operator
on the space $V$. Indeed, we will eventually be interested in applying
Theorem 17.4 to the case in which $V$ is a subspace of $L^2(\mathbb{R}^3)$ that is invariant
under the action of SO(3). In that case, $L_3$ will be precisely (the restriction
to $V$ of) the dimensionless angular momentum operator $\hat{J}_3/\hbar$.

Observe that Theorem 17.4 bears a strong similarity to our analysis of 
the quantum harmonic oscillator. In both cases, we have a “chain” of eigen¬ 
vectors for a certain operator, along with raising and lowering operators 
that raise and lower the eigenvalue of that operator. In the case of the
harmonic oscillator, we have a chain that begins with a ground state and
then extends infinitely in one direction. In the case of so(3) representations,
we have a chain that is finite in both directions. The chain begins with an
eigenvector $v_0$ for $L_3$ with maximal eigenvalue, so that $v_0$ is annihilated
by the raising operator $L^+$. A key step in the proof of Theorem 17.4 is to
determine how the chain can terminate (in the direction of lower
eigenvalues for $L_3$) without violating the commutation relations among $L_3$, $L^+$,
and $L^-$.

Proof of Theorem 17.4. Since $\pi$ is a Lie algebra homomorphism, the
$\pi(F_j)$’s satisfy the same commutation relations as the $F_j$’s themselves.
From this we can easily verify the following relations among the operators
$L^+$, $L^-$, and $L_3$:

$$[L_3, L^+] = L^+ \tag{17.8}$$
$$[L_3, L^-] = -L^- \tag{17.9}$$
$$[L^+, L^-] = 2L_3. \tag{17.10}$$

Now, since we are working over the algebraically closed field $\mathbb{C}$, the operator
$L_3$ has at least one eigenvector $v$ with eigenvalue $\lambda$. Consider, then, $L^+v$.
Using (17.8), we compute that

$$L_3L^+v = (L^+L_3 + L^+)v = L^+(\lambda v) + L^+v = (\lambda + 1)L^+v. \tag{17.11}$$

Thus, either $L^+v = 0$ or $L^+v$ is an eigenvector for $L_3$ with eigenvalue
$\lambda + 1$. We call $L^+$ the “raising operator,” since it has the effect of raising
the eigenvalue of $L_3$ by 1.

If we apply $L^+$ repeatedly to $v$, we obtain eigenvectors for $L_3$ with
eigenvalues increasing by 1 at each step, as long as we do not get the zero vector.
Eventually, though, we must get 0, since the operator $L_3$ has only finitely
many eigenvalues. Thus, there exists $k \geq 0$ such that $(L^+)^kv \neq 0$ but
$(L^+)^{k+1}v = 0$. By applying (17.11) repeatedly, we see that $(L^+)^kv$ is an
eigenvector for $L_3$ with eigenvalue $\lambda + k$.

Let us now introduce the notation $v_0 := (L^+)^kv$ and $\mu = \lambda + k$. Then $v_0$
is a nonzero vector with $L^+v_0 = 0$ and $L_3v_0 = \mu v_0$. We now forget about
the original vector $v$ and eigenvalue $\lambda$ and consider only $v_0$ and $\mu$. Define
vectors $v_j$ by

$$v_j = (L^-)^jv_0, \quad j = 0, 1, 2, \ldots.$$

Arguing as in (17.11), but using (17.9) in place of (17.8), we see that $L^-$
has the effect of either lowering the eigenvalue of $L_3$ by 1 or of giving the
zero vector. Thus, $L_3v_j = (\mu - j)v_j$.

Next, we claim that for $j \geq 1$ we have

$$L^+v_j = j(2\mu + 1 - j)v_{j-1}, \quad j = 1, 2, 3, \ldots, \tag{17.12}$$

which is easily proved by induction on $j$, using (17.10) (Exercise 2). Since,
again, $L_3$ has only finitely many eigenvalues, $v_j$ must eventually be zero.
Thus, there exists some $N \geq 0$ such that $v_N \neq 0$ but $v_{N+1} = 0$. Since
$v_{N+1} = 0$, applying (17.12) with $j = N + 1$ gives

$$0 = L^+v_{N+1} = (N + 1)(2\mu - N)v_N.$$

Since $v_N \neq 0$ and $N + 1 > 0$, we must have $2\mu - N = 0$. This means that
$\mu$ must equal $N/2$.

Letting $l = N/2$ and putting $\mu = N/2 = l$, we have the formulas recorded
in (17.6). Meanwhile, since the $v_j$’s are eigenvectors for $L_3$ with distinct
eigenvalues, the $v_j$’s are automatically linearly independent. Furthermore,
the span of the $v_j$’s is invariant under $L^+$, $L^-$, and $L_3$, hence under all of
so(3). Since $V$ is assumed to be irreducible, the span of the $v_j$’s must be
all of $V$. Thus, the $v_j$’s form a basis for $V$. The dimension of $V$ is therefore
equal to the number of $v_j$’s, which is $N + 1 = 2l + 1$. ■

Proof of Theorem 17.6. We construct $V$ simply by defining a space
$V$ with basis $v_0, v_1, \ldots, v_{2l}$ and defining the action of so(3) by (17.6). It
is a simple matter (Exercise 4) to check that $L^+$, $L^-$, and $L_3$, defined in
this way, have the correct commutation relations, so that $V$ is indeed a
representation of so(3).

It remains to show that $V$ is irreducible. Suppose that $W$ is an invariant
subspace of $V$ and that $W \neq \{0\}$. We need to show that $W = V$. To
this end, suppose that $w$ is some nonzero element of $W$, which we can
decompose as $w = \sum_{j=0}^{2l} a_jv_j$. Let $j_0$ be the largest index for which $a_j$ is
nonzero. According to the formula for $L^+$ in (17.6), applying $L^+$ to any
of the vectors $v_1, \ldots, v_{2l}$ gives a nonzero multiple of the previous element
in our chain. Thus, $(L^+)^{j_0}w$ will be a nonzero multiple of $v_0$. Since $W$
is invariant, this means that $v_0$ belongs to $W$. But then by applying $L^-$
repeatedly, we see that $v_j$ belongs to $W$ for each $j$, so that $W = V$.

Theorem 17.4 tells us that any irreducible representation of so(3) of
dimension $2l + 1$ has a basis as in (17.6). We can then construct an
isomorphism between any two irreducible representations by mapping this basis
in one space to the corresponding basis in the other space. ■

In the rest of this section, we look at some additional properties of
representations of so(3).

Proposition 17.7 Let $\pi : \mathrm{so}(3) \to \mathrm{gl}(V)$ be an irreducible representation
of so(3). Then there exists an inner product on $V$, unique up to
multiplication by a constant, such that $\pi(X)$ is skew-self-adjoint for all $X \in \mathrm{so}(3)$.

Proof. Recalling how the operators $L_3$, $L^+$, and $L^-$ are defined, we can
see that the assertion that each $\pi(X)$, $X \in \mathrm{so}(3)$, is skew-self-adjoint is
equivalent to the assertion that $L_3$ is self-adjoint and that $L^+$ and $L^-$
are adjoints of each other. Since the $v_j$’s are eigenvectors for $L_3$ with
distinct eigenvalues, if $L_3$ is to be self-adjoint, the $v_j$’s must be orthogonal.
Conversely, if we have any inner product for which the $v_j$’s are orthogonal,
then $L_3$ will be self-adjoint, as is easily verified.

It remains to investigate the consequences of the condition $(L^+)^* = L^-$.
Assuming this condition, we compute that

$$\langle v_j, v_j \rangle = \langle L^-v_{j-1}, L^-v_{j-1} \rangle = \langle v_{j-1}, L^+L^-v_{j-1} \rangle.$$

But $L^+L^- = L^-L^+ + 2L_3$. Furthermore, $L_3v_{j-1} = (l - j + 1)v_{j-1}$ and
$L^+v_{j-1} = (j - 1)(2l - j + 2)v_{j-2}$ and, thus,

$$\langle v_j, v_j \rangle = \langle v_{j-1}, L^+L^-v_{j-1} \rangle$$
$$= (j - 1)(2l - j + 2)\langle v_{j-1}, L^-v_{j-2} \rangle + 2(l - j + 1)\langle v_{j-1}, v_{j-1} \rangle.$$

Recalling that $L^-v_{j-2} = v_{j-1}$ and simplifying gives

$$\langle v_j, v_j \rangle = j(2l - j + 1)\langle v_{j-1}, v_{j-1} \rangle. \tag{17.13}$$

It is easy to see that if the $v_j$’s are orthogonal, then $L^+$ and $L^-$ are adjoints
of each other if and only if the normalization condition (17.13) holds for
$j = 1, 2, \ldots, 2l$. Since $j(2l - j + 1)$ is positive for each such $j$, there is no
obstruction to normalizing the $v_j$’s so that this condition holds, and so an
inner product with the desired property exists. Since the only freedom of
choice in defining the inner product is the normalization of $v_0$, the inner
product is unique up to multiplication by a constant. ■
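The normalization condition (17.13) is easy to tabulate. The sketch below, with the hypothetical names `squared_norms` and `adjoint_defect` and the assumed normalization $\langle v_0, v_0 \rangle = 1$, computes the squared norms recursively and then checks that $\langle L^+v_j, v_k \rangle = \langle v_j, L^-v_k \rangle$ for all basis pairs, using the formulas (17.6) and the orthogonality of the $v_j$’s. All coefficients are real, so no complex conjugation is needed.

```python
from fractions import Fraction


def squared_norms(two_l):
    """<v_j, v_j> for j = 0..2l from (17.13), normalized so that <v_0, v_0> = 1."""
    norms = [Fraction(1)]
    for j in range(1, two_l + 1):
        # <v_j, v_j> = j (2l - j + 1) <v_{j-1}, v_{j-1}>
        norms.append(j * (two_l - j + 1) * norms[-1])
    return norms


def adjoint_defect(two_l):
    """Largest |<L+ v_j, v_k> - <v_j, L- v_k>| over all pairs of basis vectors."""
    norms = squared_norms(two_l)
    worst = Fraction(0)
    for j in range(two_l + 1):
        for k in range(two_l + 1):
            # <L+ v_j, v_k> = j(2l+1-j) <v_{j-1}, v_k>, nonzero only when k = j - 1.
            lhs = (j * (two_l + 1 - j) * norms[k]
                   if (j > 0 and k == j - 1) else Fraction(0))
            # <v_j, L- v_k> = <v_j, v_{k+1}>, nonzero only when k + 1 = j.
            rhs = norms[j] if (k < two_l and k + 1 == j) else Fraction(0)
            worst = max(worst, abs(lhs - rhs))
    return worst
```

For $l = 1$ (that is, `two_l = 2`) the recursion gives squared norms 1, 2, 4, and the defect is zero for every value of `two_l` tried, consistent with the claim that (17.13) is exactly the condition for $L^+$ and $L^-$ to be adjoints of each other.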

Proposition 17.8 Suppose $(\pi, V)$ is an irreducible representation of so(3)
of dimension $2l + 1$. Define the Casimir operator $C_\pi \in \mathrm{End}(V)$ by the
formula

$$C_\pi = \pi(F_1)^2 + \pi(F_2)^2 + \pi(F_3)^2.$$

Then for all $v \in V$, we have

$$C_\pi v = -l(l + 1)v.$$


Proof. See Exercise 3. ■ 
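A numerical sanity check of the Casimir value, which is not a substitute for the exercise, can be sketched as follows. It assumes the matrices of $L_3$, $L^\pm$ from (17.6) and uses the identity $C_\pi = -\left(\tfrac{1}{2}(L^+L^- + L^-L^+) + L_3^2\right)$, which follows from solving the definitions of $L^\pm$ and $L_3$ for the $\pi(F_j)$’s. The names `spin_matrices`, `matmul`, `casimir`, and `expected` are hypothetical.

```python
from fractions import Fraction


def spin_matrices(two_l):
    """Matrices of L3, L+, L- in the basis v_0..v_{2l} from (17.6); two_l = 2l."""
    n = two_l + 1
    l = Fraction(two_l, 2)
    L3 = [[Fraction(0)] * n for _ in range(n)]
    Lp = [[Fraction(0)] * n for _ in range(n)]
    Lm = [[Fraction(0)] * n for _ in range(n)]
    for j in range(n):
        L3[j][j] = l - j
        if j > 0:
            Lp[j - 1][j] = Fraction(j * (two_l + 1 - j))
        if j < two_l:
            Lm[j + 1][j] = Fraction(1)
    return L3, Lp, Lm


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def casimir(two_l):
    """C = -((L+ L- + L- L+)/2 + L3^2), expressed through L+, L-, L3."""
    L3, Lp, Lm = spin_matrices(two_l)
    n = two_l + 1
    PM, MP, L3sq = matmul(Lp, Lm), matmul(Lm, Lp), matmul(L3, L3)
    return [[-((PM[i][j] + MP[i][j]) / 2 + L3sq[i][j]) for j in range(n)]
            for i in range(n)]


def expected(two_l):
    """The scalar matrix -l(l+1) I of size 2l+1."""
    l = Fraction(two_l, 2)
    n = two_l + 1
    return [[(-l * (l + 1) if i == j else Fraction(0)) for j in range(n)]
            for i in range(n)]
```

Comparing `casimir(two_l)` with `expected(two_l)` for small values of `two_l` shows the scalar $-l(l+1)$ on the diagonal, for example $-3/4$ when $l = 1/2$.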

If we look at the proof of Theorem 17.4, we see that the only place in
which irreducibility was used is in showing that the span of $v_0, v_1, \ldots, v_{2l}$
is equal to $V$. We can therefore obtain the following result, which will be
used in Sect. 17.9.

Proposition 17.9 Let $(\pi, V)$ be any finite-dimensional representation of
so(3), not necessarily irreducible. Suppose $v_0$ is a nonzero element of $V$ such
that $L^+v_0 = 0$ and $L_3v_0 = \lambda v_0$ for some $\lambda \in \mathbb{C}$. Then $\lambda$ is equal to a
non-negative integer or half-integer $l$. Furthermore, the vectors $v_0, v_1, \ldots, v_{2l}$
defined by

$$v_j = (L^-)^jv_0, \quad j = 0, 1, \ldots, 2l,$$

span an irreducible invariant subspace of $V$ of dimension $2l + 1$, and $L^+$,
$L^-$, and $L_3$ act on these vectors according to the formulas in Theorem 17.4.
In general, given a finite-dimensional representation $(\pi, V)$ of a Lie
algebra and a nonzero vector $v_0 \in V$, we say that $v_0$ is a cyclic
vector for $V$ if the smallest invariant subspace of $V$ containing $v_0$ is all
of $V$. In Proposition 17.9, the vector $v_0$ is certainly a cyclic vector for
$W := \mathrm{span}(v_0, \ldots, v_{2l})$. It should be noted, however, that a representation’s
having a cyclic vector does not, in general, mean that the representation
is irreducible (Exercise 5). Thus, the irreducibility of $W$ is not the result
of some general result about cyclic vectors, but holds only because of the
assumed special properties of the vector $v_0$.


17.5 The Irreducible Representations of SO(3)

Having classified the irreducible representations of the Lie algebra so(3),
we now turn to the classification of the representations of the group SO(3).
Since SO(3) is connected (Exercise 13 in Chap. 16), Proposition 16.39 tells
us that a representation of SO(3) is irreducible if and only if the associated
Lie algebra representation is irreducible, and that two representations of
SO(3) are isomorphic if and only if the associated Lie algebra
representations are isomorphic. Thus, to classify the irreducible representations of
SO(3) up to isomorphism, we merely have to determine which irreducible
representations of the Lie algebra so(3) come from a representation of the
group SO(3).

Proposition 17.10 Let $\pi_l : \mathrm{so}(3) \to \mathrm{gl}(V)$ be an irreducible representation
of so(3), with spin $l := \frac{1}{2}(\dim V - 1)$. If $l$ is an integer (i.e., if the dimension
of $V$ is odd), then there exists a representation $\Pi_l : \mathrm{SO}(3) \to \mathrm{GL}(V)$ such
that $\Pi_l$ and $\pi_l$ are related as in Theorem 16.23. If $l$ is a half-integer (i.e.,
if the dimension of $V$ is even) then no such representation $\Pi_l$ exists.

It follows from this result and Proposition 16.39 that the irreducible
representations of the group SO(3) are precisely the $\Pi_l$’s for which $l$ is an
integer.

Proof. If $l$ is a half-integer, then $L_3$ is diagonal in the basis $\{v_j\}$, with
eigenvalues being half-integers. Thus,

$$e^{2\pi\pi_l(F_3)} = e^{-2\pi iL_3} = -I.$$

(Here the “$\pi$” in front of $\pi_l$ is the number $\pi = 3.14\ldots$.) On the other hand,
by a simple modification of Example 16.16, we can see that the matrix
$F_3 \in \mathrm{so}(3)$ satisfies $e^{2\pi F_3} = I$. Thus, if a corresponding representation $\Pi_l$
of SO(3) existed, we would have

$$\Pi_l(I) = \Pi_l(e^{2\pi F_3}) = e^{2\pi\pi_l(F_3)} = -I,$$

which is a contradiction.
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If l is an integer, we make use of the isomorphism cf> between su(2) 
and so(3) described in the proof of Example 16.32, which maps the ba¬ 
sis {Ei, E 2l E 3 } of su(2) to the basis {F 3l F 2 , F 3 } of so(3). We obtain a 
representation tt[ of su(2) by setting 7r z '(X) = 7 r/(^(X)). Since SU(2) is sim¬ 
ply connected, Theorem 16.30 tell us that there is a representation IIJ of 
SU(2) related to 7r{ in the usual way. We then compute that 

n ; (-/) = n; (e 2 * Ei ) = e 2n ^ Ei) = e 2 ™ i{Fi) = e 2 ™ Ls = /, 

since the eigenvalues of L 3 are integers. 

Now, by Example 16.34, there is a surjective homomorphism $ from 
SU(2) onto SO(3) for which the associated Lie algebra homomorphism is </>, 
and ker4> = {/, —I}. Since the kernel of Ilj contains {/, —/}, the map Ilj 
factors through SO(3), giving a representation II; of SO(3) such that IIJ = 
II z o$. By Exercise 10 in Chap. 16, the associated Lie algebra representation 
ai of so(3) satisfies 7 r{ = cq o <^>, so that cq = 7i{ o </> _1 = 717. Thus, II; is the 
desired representation of SO(3). ■ 
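
The dichotomy in the proof can be checked numerically. The following is a minimal sketch, assuming (as in Theorem 17.4) that $L_3 = i\pi_l(F_3)$ is diagonal with eigenvalues $l, l-1, \ldots, -l$; the function names are hypothetical and chosen only for this illustration.

```python
import cmath
from fractions import Fraction

def exp_2pi_piF3(l):
    """Diagonal entries of exp(2*pi*pi_l(F_3)) in a basis diagonalizing L_3.

    Assumption: L_3 = i*pi_l(F_3) has eigenvalues l, l-1, ..., -l, so that
    pi_l(F_3) = -i*L_3 and the exponential has entries exp(-2*pi*i*m).
    """
    l = Fraction(l)
    dim = int(2 * l + 1)
    weights = [l - j for j in range(dim)]
    return [cmath.exp(-2j * cmath.pi * float(m)) for m in weights]

def is_scalar(entries, value, tol=1e-12):
    """True if every diagonal entry equals the given scalar up to tol."""
    return all(abs(z - value) < tol for z in entries)

for l in [Fraction(1, 2), 1, Fraction(3, 2), 2]:
    entries = exp_2pi_piF3(l)
    if is_scalar(entries, 1):
        result = "+I"
    elif is_scalar(entries, -1):
        result = "-I"
    else:
        result = "not a multiple of I"
    print(f"l = {l}: exp(2*pi*pi_l(F_3)) = {result}")
```

Since $e^{2\pi F_3} = I$ in SO(3), only the cases reported as `+I` are compatible with a representation of the group, which matches the integer-spin condition in Proposition 17.10.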


17.6 Realizing the Representations Inside $L^2(S^2)$

In this section, we deviate from the traditional treatment in the physics
literature by thinking of the “spherical harmonics” as restrictions to the unit
sphere of certain polynomials on $\mathbb{R}^3$, rather than describing the spherical
harmonics in angular coordinates on the sphere. Our approach avoids some
messy computations in polar coordinates and it also generalizes readily to
higher dimensions.

Recall from Sect. 17.3 that there is a natural unitary representation $\Pi$ of
SO(3) on $L^2(\mathbb{R}^3)$ given by $(\Pi(R)\psi)(x) = \psi(R^{-1}x)$. In solving rotationally
invariant problems such as the quantum hydrogen atom, it will be useful
to understand the structure of finite-dimensional subspaces $V$ of $L^2(\mathbb{R}^3)$
such that $V$ is invariant under $\Pi$ and such that the restriction of $\Pi$ to $V$ is
irreducible.

If we write functions on $\mathbb{R}^3$ in polar coordinates, then SO(3) acts only on
the angle variables. Thus, it is useful to consider also the action of SO(3)
on $L^2(S^2)$, given by the same formula as for $L^2(\mathbb{R}^3)$, namely

\[ (\Pi(R)\psi)(x) = \psi(R^{-1}x), \quad x \in S^2. \]

In computing the norm for $L^2(S^2)$, we use the surface area measure on
$S^2$, which is invariant under the action of SO(3). Once we have found
invariant subspaces inside $L^2(S^2)$, it is a simple matter to produce invariant
subspaces inside $L^2(\mathbb{R}^3)$ as well, as we will see in the next section.

We will be interested in this section in harmonic polynomials on $\mathbb{R}^3$, that
is, polynomials $p$ satisfying $\Delta p = 0$, where $\Delta$ is the Laplacian. Since we
always consider representations over $\mathbb{C}$, we allow these polynomials to have
complex coefficients.

Definition 17.11 Let $l$ be a non-negative integer. Define a subspace $V_l$ of
$L^2(S^2)$ by setting $V_l$ equal to the space of restrictions to $S^2$ of harmonic
polynomials on $\mathbb{R}^3$ that are homogeneous of degree $l$. Then $V_l$ is called the
space of spherical harmonics of degree $l$.

Note that if $p$ is a homogeneous polynomial on $\mathbb{R}^3$ of some degree $l$, then
the restriction of $p$ to $S^2$ is identically zero only if $p$ itself is identically zero.
After all, if $p$ is homogeneous of degree $l$ and zero on $S^2$, then

\[ p(x) = |x|^l \, p\!\left(\frac{x}{|x|}\right) = 0 \]

for all $x \neq 0$, and hence, by continuity, for all $x \in \mathbb{R}^3$. (By contrast, the
nonzero, nonhomogeneous polynomial $p(x) := x_1^2 + x_2^2 + x_3^2 - 1$ is identically
zero on $S^2$.) We are therefore free to shift back and forth between thinking
of the elements of $V_l$ as functions on $S^2$ or as functions on $\mathbb{R}^3$.

It is well known that the Laplacian $\Delta$ commutes with rotations. It follows
that each $V_l$ is invariant under the action of the rotation group. We will
eventually see that $V_l$ is irreducible under this action.

Every homogeneous polynomial of degree 0 or 1 is harmonic. Thus, $V_0$
consists of the constant functions on $S^2$ and $V_1$ is spanned by the restrictions
to $S^2$ of the functions $x_1$, $x_2$, and $x_3$. Meanwhile, the space of homogeneous
polynomials of degree 2 is 6-dimensional, and the space of harmonic
polynomials that are homogeneous of degree 2 is spanned by the following
five polynomials: $x_1 x_2$, $x_2 x_3$, $x_3 x_1$, $x_1^2 - x_2^2$, and $x_2^2 - x_3^2$. (The polynomial
$x_1^2 - x_3^2$ is also harmonic, but it is just the sum of $x_1^2 - x_2^2$ and $x_2^2 - x_3^2$.)
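
The degree-2 claims above can be checked mechanically. The sketch below is only an illustration: it stores a polynomial as a dictionary mapping exponent triples $(a, b, c)$ (standing for $x_1^a x_2^b x_3^c$) to coefficients, a representation chosen here for convenience and not taken from the text.

```python
def add(p, q, scale=1):
    """Return p + scale*q for polynomials stored as {(a, b, c): coeff}."""
    out = dict(p)
    for mono, coeff in q.items():
        out[mono] = out.get(mono, 0) + scale * coeff
    return {m: c for m, c in out.items() if c != 0}

def laplacian(p):
    """Apply d^2/dx1^2 + d^2/dx2^2 + d^2/dx3^2 term by term."""
    out = {}
    for mono, coeff in p.items():
        for i in range(3):
            if mono[i] >= 2:
                new = list(mono)
                new[i] -= 2
                key = tuple(new)
                out[key] = out.get(key, 0) + coeff * mono[i] * (mono[i] - 1)
    return {m: c for m, c in out.items() if c != 0}

# The five spanning polynomials of degree 2, plus x1^2 - x3^2.
x1x2 = {(1, 1, 0): 1}
x2x3 = {(0, 1, 1): 1}
x3x1 = {(1, 0, 1): 1}
d12 = {(2, 0, 0): 1, (0, 2, 0): -1}   # x1^2 - x2^2
d23 = {(0, 2, 0): 1, (0, 0, 2): -1}   # x2^2 - x3^2
d13 = {(2, 0, 0): 1, (0, 0, 2): -1}   # x1^2 - x3^2

for name, p in [("x1*x2", x1x2), ("x2*x3", x2x3), ("x3*x1", x3x1),
                ("x1^2 - x2^2", d12), ("x2^2 - x3^2", d23)]:
    print(name, "is harmonic:", laplacian(p) == {})

print("x1^2 - x3^2 is the sum of the last two:", add(d12, d23) == d13)
print("x1^2 alone is not harmonic, Laplacian =", laplacian({(2, 0, 0): 1}))
```

An empty dictionary plays the role of the zero polynomial, so `laplacian(p) == {}` is the harmonicity test.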

Theorem 17.12 The spaces $V_l$ have the following properties.

1. Each $V_l$ has dimension $2l + 1$.

2. Each $V_l$ is invariant under the action of the rotation group and
irreducible under this action.

3. For $l \neq m$, the spaces $V_l$ and $V_m$ are orthogonal in $L^2(S^2)$.

4. The Hilbert space $L^2(S^2)$ decomposes as the orthogonal direct sum of
the $V_l$’s, as $l$ ranges over the non-negative integers.

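The remainder of this section will be devoted to the proof of
Theorem 17.12. We proceed in a series of lemmas, along with some corollaries
of those lemmas.

Lemma 17.13 Let $\mathcal{P}$ denote the space of polynomials on $\mathbb{R}^3$ with complex
coefficients. There exists an inner product $\langle \cdot, \cdot \rangle_{\mathcal{P}}$ on $\mathcal{P}$ with the property that

\[ \langle p, \Delta q \rangle_{\mathcal{P}} = \langle x^2 p, q \rangle_{\mathcal{P}}, \]

where

\[ x^2 = x_1^2 + x_2^2 + x_3^2. \]

Proof. Although it is possible to give a combinatorial construction of the
desired inner product, we can also give an analytic construction. Every
polynomial $p$ on $\mathbb{R}^3$ certainly has a holomorphic extension to $\mathbb{C}^3$, denoted
$p_{\mathbb{C}}$. We may define, then,

\[ \langle p, q \rangle_{\mathcal{P}} = \int_{\mathbb{C}^3} \overline{p_{\mathbb{C}}(z)} \, q_{\mathbb{C}}(z) \, \mu_1(z) \, dz, \]

where $\mu_1$ is the Gaussian density on $\mathbb{C}^3$, so that this is nothing but the
inner product of $p_{\mathbb{C}}$ and $q_{\mathbb{C}}$ as elements of the
Segal-Bargmann space $\mathcal{H}L^2(\mathbb{C}^3, \mu_1)$. According to Lemma 14.12, we have

\[ \int_{\mathbb{C}^3} \overline{p_{\mathbb{C}}(z)} \, \frac{\partial q_{\mathbb{C}}}{\partial z_j}(z) \, \mu_1(z) \, dz = \int_{\mathbb{C}^3} \overline{z_j \, p_{\mathbb{C}}(z)} \, q_{\mathbb{C}}(z) \, \mu_1(z) \, dz \]

for all $p, q \in \mathcal{P}$ and all $j = 1, 2, 3$. This relation means that

\[ \left\langle p, \frac{\partial q}{\partial x_j} \right\rangle_{\mathcal{P}} = \langle x_j p, q \rangle_{\mathcal{P}}, \]

from which we readily obtain the desired property of our inner product. ■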
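
Let $\mathcal{P}_l$ denote the space of polynomials on $\mathbb{R}^3$ that are homogeneous of
degree $l$. A standard bit of elementary combinatorics shows that the number of
ordered triples $(l_1, l_2, l_3)$ of non-negative integers with $l_1 + l_2 + l_3 = l$ is
equal to $(l + 2)(l + 1)/2$. Since the monomials $x_1^{l_1} x_2^{l_2} x_3^{l_3}$ with
$l_1 + l_2 + l_3 = l$ form a basis for $\mathcal{P}_l$, we have $\dim \mathcal{P}_l = (l + 2)(l + 1)/2$.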

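Corollary 17.14 If $\mathcal{P}_l$ denotes the space of polynomials on $\mathbb{R}^3$ that are
homogeneous of degree $l$, then the Laplacian $\Delta$ maps $\mathcal{P}_l$ onto $\mathcal{P}_{l-2}$ for all
$l \geq 2$. Thus, for all $l \geq 2$, we have

\[ \dim V_l = \dim \mathcal{P}_l - \dim \mathcal{P}_{l-2} = \frac{(l + 2)(l + 1)}{2} - \frac{l(l - 1)}{2} = 2l + 1. \]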


Proof. Let us equip the finite-dimensional spaces $\mathcal{P}_l$ and $\mathcal{P}_{l-2}$ with the
inner product from Lemma 17.13. It is easy to see that the statement,
“The orthogonal complement of the image is the kernel of the adjoint,”
applies to linear maps of one finite-dimensional inner product space to
another. Applying this to $\Delta : \mathcal{P}_l \to \mathcal{P}_{l-2}$, we note that the adjoint of $\Delta$ is
multiplication by $x^2$, which is clearly injective, since $x_1^2 + x_2^2 + x_3^2$ is zero
only at the origin. Thus, the orthogonal complement of the image of $\Delta$ is
$\{0\}$. Since the spaces are finite-dimensional, this means that $\Delta$ maps $\mathcal{P}_l$
onto $\mathcal{P}_{l-2}$. ■
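
The dimension count in Corollary 17.14 can be confirmed for small $l$ by computing the rank of the Laplacian directly. The following is a sketch under the assumption that monomials are used as bases of $\mathcal{P}_l$ and $\mathcal{P}_{l-2}$; the helper names are hypothetical, and exact rational arithmetic is used so that the rank is not affected by rounding.

```python
from fractions import Fraction

def monomials(l):
    """Exponent triples (a, b, c) of non-negative integers with a + b + c = l."""
    return [(a, b, l - a - b) for a in range(l + 1) for b in range(l - a + 1)]

def laplacian_matrix(l):
    """Matrix of the Laplacian from P_l to P_{l-2} in the monomial bases (l >= 2)."""
    rows = {m: i for i, m in enumerate(monomials(l - 2))}
    cols = monomials(l)
    mat = [[Fraction(0)] * len(cols) for _ in rows]
    for j, mono in enumerate(cols):
        for i in range(3):
            if mono[i] >= 2:
                target = list(mono)
                target[i] -= 2
                mat[rows[tuple(target)]][j] += mono[i] * (mono[i] - 1)
    return mat

def rank(mat):
    """Rank by exact Gaussian elimination over the rationals."""
    mat = [row[:] for row in mat]
    r = 0
    ncols = len(mat[0]) if mat else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(mat)) if mat[i][c] != 0), None)
        if pivot is None:
            continue
        mat[r], mat[pivot] = mat[pivot], mat[r]
        for i in range(r + 1, len(mat)):
            factor = mat[i][c] / mat[r][c]
            if factor != 0:
                mat[i] = [a - factor * b for a, b in zip(mat[i], mat[r])]
        r += 1
    return r

def harmonic_dimension(l):
    """dim V_l = dim P_l - rank(Laplacian), by the rank-nullity theorem."""
    if l < 2:
        return len(monomials(l))  # every polynomial of degree 0 or 1 is harmonic
    return len(monomials(l)) - rank(laplacian_matrix(l))

for l in range(7):
    print(f"l = {l}: dim P_l = {len(monomials(l))}, dim V_l = {harmonic_dimension(l)}")
```

For each $l \geq 2$ the rank equals $\dim \mathcal{P}_{l-2}$, which is the surjectivity statement of the corollary, and the kernel has dimension $2l + 1$.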

Corollary 17.15 Let $l$ be a non-negative integer and let $k = l/2$ if $l$ is
even and let $k = (l - 1)/2$ if $l$ is odd. Then each $p \in \mathcal{P}_l$ can be decomposed
in the form

\[ p(x) = p_0(x) + |x|^2 p_1(x) + |x|^4 p_2(x) + \cdots + |x|^{2k} p_k(x), \]

where each $p_j(x)$ is a harmonic polynomial that is homogeneous of degree
$l - 2j$. In particular, the restriction of $p$ to $S^2$ satisfies

\[ p|_{S^2} = (p_0 + p_1 + \cdots + p_k)|_{S^2}, \]

where $p_0 + p_1 + \cdots + p_k$ is a (nonhomogeneous) harmonic polynomial.

Given any polynomial $p$, not necessarily homogeneous, we can apply
Corollary 17.15 to each homogeneous piece of $p$. We see, then, that given
any polynomial $p$, there exists a harmonic polynomial $q$ such that $p$ and $q$
have the same restriction to $S^2$.

Proof. We proceed by induction on $l$. If $l = 0$ or $l = 1$, then all $p \in \mathcal{P}_l$
are harmonic and the desired decomposition is simply $p = p_0$. Consider,
then, some $l \geq 2$ and assume the result holds for all degrees less than $l$.
Lemma 17.13 tells us that $\mathcal{P}_l$ decomposes as an orthogonal direct sum of
the kernel of $\Delta$ and the image of $\mathcal{P}_{l-2}$ under multiplication by $|x|^2$. Thus,
any $p \in \mathcal{P}_l$ can be decomposed as $p = p_0 + |x|^2 q_0$, where $p_0$ is harmonic
and $q_0$ belongs to $\mathcal{P}_{l-2}$. By induction, $q_0$ has a decomposition of the desired
form; substituting this in for $q_0$ in the decomposition $p = p_0 + |x|^2 q_0$ gives
the desired decomposition of $p$. ■
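
As a small worked illustration of Corollary 17.15 (not taken from the text), consider $p(x) = x_1^2$ with $l = 2$ and $p(x) = x_1^3$ with $l = 3$. In both cases $k = 1$, and the harmonic pieces can be found by subtracting a suitable multiple of $|x|^2$ times a lower-degree polynomial.

```latex
% Degree 2: \Delta(x_1^2) = 2 and \Delta(|x|^2) = 6.
x_1^2 = \underbrace{\left( x_1^2 - \tfrac{1}{3}|x|^2 \right)}_{p_0,\ \Delta p_0 = 2 - \frac{6}{3} = 0}
        + |x|^2 \underbrace{\tfrac{1}{3}}_{p_1},
\qquad
x_1^2 \big|_{S^2} = \left( p_0 + \tfrac{1}{3} \right)\big|_{S^2}.

% Degree 3: \Delta(x_1^3) = 6 x_1 and \Delta(|x|^2 x_1) = 10 x_1.
x_1^3 = \underbrace{\left( x_1^3 - \tfrac{3}{5}|x|^2 x_1 \right)}_{p_0,\ \Delta p_0 = 6x_1 - \frac{3}{5}\cdot 10 x_1 = 0}
        + |x|^2 \underbrace{\tfrac{3}{5} x_1}_{p_1}.
```

In each case $p_0$ is harmonic and homogeneous of degree $l$, while $p_1$ is harmonic and homogeneous of degree $l - 2$, as the corollary requires.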

To show that $V_l$ is irreducible under the action $\Pi$ of SO(3), we pass to
the Lie algebra. Since, as we have remarked, restriction to the sphere is
injective on homogeneous polynomials, we may think of the elements of $V_l$
as polynomials on $\mathbb{R}^3$, in which case, the Lie algebra action $\pi$ associated
with $\Pi$ is given in terms of the usual angular momentum operators.

Lemma 17.16 As in Theorem 17.4, let $L_3 = i\pi(F_3) = J_3$ and let $L^+ =
i\pi(F_1) - \pi(F_2) = J_1 + iJ_2$. For any non-negative integer $l$, the polynomial
$p(x_1, x_2, x_3) := (x_1 + i x_2)^l$ belongs to $V_l$ and satisfies

\[ L_3 p = l p \]

and

\[ L^+ p = 0. \]


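Proof. Since it is independent of $x_3$ and holomorphic as a function of
$z := x_1 + i x_2$, the polynomial $p$ is automatically harmonic, which can also
be verified by direct calculation. Meanwhile, applying $L_3$ to $p$ gives

\[
\begin{aligned}
-i\left( x_1 \frac{\partial}{\partial x_2} - x_2 \frac{\partial}{\partial x_1} \right)(x_1 + i x_2)^l
  &= -i\left[ x_1 l (x_1 + i x_2)^{l-1}(i) - x_2 l (x_1 + i x_2)^{l-1} \right] \\
  &= l (x_1 + i x_2)^l.
\end{aligned}
\]

Finally, applying $L^+ := i\pi(F_1) - \pi(F_2)$ to $p$ gives

\[
\begin{aligned}
&-i\left( x_2 \frac{\partial}{\partial x_3} - x_3 \frac{\partial}{\partial x_2} \right)p
  + \left( x_3 \frac{\partial}{\partial x_1} - x_1 \frac{\partial}{\partial x_3} \right)p \\
&\qquad = -i\left( -x_3 l (x_1 + i x_2)^{l-1}(i) \right) + x_3 l (x_1 + i x_2)^{l-1}(1) \\
&\qquad = 0,
\end{aligned}
\]

as claimed. ■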
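
The two identities of Lemma 17.16 can be verified exactly for small $l$. The sketch below assumes the explicit differential-operator forms of $L_3$ and $L^+$ used in the proof; polynomials are stored as dictionaries from exponent triples to complex coefficients, which is a convention adopted only for this illustration.

```python
from math import comb

I_POWERS = [1, 1j, -1, -1j]  # exact values of i^k for k mod 4

def highest_weight_poly(l):
    """Expand (x1 + i*x2)^l as {(a, b, c): coeff} for x1^a x2^b x3^c."""
    return {(l - k, k, 0): comb(l, k) * I_POWERS[k % 4] for k in range(l + 1)}

def x_d(p, i, j):
    """Apply the operator x_i * d/dx_j (indices 0, 1, 2) to p."""
    out = {}
    for mono, coeff in p.items():
        if mono[j] == 0:
            continue
        new = list(mono)
        scaled = coeff * new[j]
        new[j] -= 1
        new[i] += 1
        key = tuple(new)
        out[key] = out.get(key, 0) + scaled
    return out

def combine(*terms):
    """Sum of (scalar, polynomial) pairs, dropping zero coefficients."""
    out = {}
    for scalar, p in terms:
        for mono, coeff in p.items():
            out[mono] = out.get(mono, 0) + scalar * coeff
    return {m: c for m, c in out.items() if c != 0}

def L3(p):
    # L_3 = -i (x1 d/dx2 - x2 d/dx1)
    return combine((-1j, x_d(p, 0, 1)), (1j, x_d(p, 1, 0)))

def L_plus(p):
    # L^+ = -i (x2 d/dx3 - x3 d/dx2) + (x3 d/dx1 - x1 d/dx3)
    return combine((-1j, x_d(p, 1, 2)), (1j, x_d(p, 2, 1)),
                   (1, x_d(p, 2, 0)), (-1, x_d(p, 0, 2)))

def laplacian(p):
    """Sum of second partial derivatives, term by term."""
    out = {}
    for mono, coeff in p.items():
        for i in range(3):
            if mono[i] >= 2:
                new = list(mono)
                new[i] -= 2
                key = tuple(new)
                out[key] = out.get(key, 0) + coeff * mono[i] * (mono[i] - 1)
    return {m: c for m, c in out.items() if c != 0}

for l in range(6):
    p = highest_weight_poly(l)
    print(l, L3(p) == combine((l, p)), L_plus(p) == {}, laplacian(p) == {})
```

All coefficients that arise have integer real and imaginary parts, so the floating-point complex arithmetic is exact here and the dictionary comparisons are reliable.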

Corollary 17.17 The space $V_l$ is irreducible under the action of SO(3).

Proof. By Proposition 17.9, if we apply $L^-$ repeatedly to the polynomial
$p$, we obtain a “chain” of eigenvectors of length $2l + 1$. These eigenvectors
span an irreducible invariant subspace of dimension $2l + 1$. Since we have
already established that $\dim V_l = 2l + 1$, the elements of the chain must
span $V_l$, which implies that $V_l$ is irreducible. ■

We have now assembled all the pieces necessary for a proof of the main 
result of this section. 

Proof of Theorem 17.12. We have already proved Points 1 and 2 of the
theorem in Corollaries 17.14 and 17.17, respectively. Now, each $V_l$ is an
irreducible representation of SO(3), and no two of the $V_l$’s can be isomorphic,
because they all have different dimensions. Thus, by Exercise 19 in
Chap. 16, $V_l$ and $V_m$ must be orthogonal inside $L^2(S^2)$ for $l \neq m$, which is
Point 3.

Finally, by the Stone-Weierstrass theorem and the density results of
Theorem A.10, the restrictions to $S^2$ of polynomials on $\mathbb{R}^3$ form a dense
subspace of $L^2(S^2)$. But Corollary 17.15 shows that the space of restrictions
to $S^2$ of polynomials coincides with the space of restrictions to $S^2$
of harmonic polynomials. Thus, the span of the $V_l$’s is dense in $L^2(S^2)$,
establishing Point 4. ■


17.7 Realizing the Representations Inside $L^2(\mathbb{R}^3)$

Recall that for homogeneous polynomials on $\mathbb{R}^3$, the restriction map from
$\mathbb{R}^3$ to $S^2$ is injective. Thus, we may think of the space $V_l$ equally well as
a space of functions on $S^2$ (as in the previous section) or as a space of
functions on $\mathbb{R}^3$. In this section, then, we will let $V_l$ denote the space of
harmonic polynomials on $\mathbb{R}^3$ that are homogeneous of degree $l$.

Definition 17.18 Suppose $l$ is a non-negative integer and $f$ is a measurable
function on $(0, \infty)$ such that

\[ \int_0^\infty |f(r)|^2 \, r^{2l+2} \, dr < \infty. \tag{17.14} \]

Let $V_{l,f} \subset L^2(\mathbb{R}^3)$ denote the space of functions $\psi$ of the form

\[ \psi(x) = p(x) f(|x|), \tag{17.15} \]

where $p \in V_l$.

The condition on $f(r)$ is precisely what one needs to make $\psi(x)$ a square-
integrable function on $\mathbb{R}^3$ (compute the $L^2$ norm in spherical coordinates).
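
The computation alluded to can be written out as follows. This is a sketch in which $d\omega$ denotes the surface area measure on $S^2$ (a notation introduced here for convenience) and the homogeneity $p(r\omega) = r^l p(\omega)$ is used.

```latex
\int_{\mathbb{R}^3} |\psi(x)|^2 \, dx
  = \int_0^\infty \int_{S^2} |p(r\omega)|^2 \, |f(r)|^2 \, r^2 \, d\omega \, dr
  = \left( \int_{S^2} |p(\omega)|^2 \, d\omega \right)
    \int_0^\infty |f(r)|^2 \, r^{2l+2} \, dr .
```

The first factor is finite because $p$ is continuous on the compact set $S^2$, so $\psi$ is square-integrable exactly when the radial integral in (17.14) is finite (for $p \neq 0$).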

Definition 17.18 is not the one that physicists typically use. In the physics
literature, one sees functions of the form

\[ \psi(x) = Y_{lm}(\theta, \phi) \, g(r), \tag{17.16} \]

where $r$, $\theta$, and $\phi$ are the usual spherical coordinates. Here $Y_{lm}$ is the
restriction to the sphere of a particular harmonic polynomial that is homogeneous
of degree $l$, written in spherical coordinates. (Up to a normalization
factor, the $Y_{lm}$’s are obtained by using the basis for $V_l$ in Theorem 17.4.)
Thus, if we move along a ray from the origin in $\mathbb{R}^3$, only the value of $g(r)$
changes. By contrast, in (17.15), as we move along a ray, the $p(x)$ factor
contributes a factor of $r^l$. We can write the physics expression in rectangular
coordinates as

\[
\begin{aligned}
\psi(x) &= Y_{lm}\!\left( \frac{x}{|x|} \right) g(|x|) \\
        &= Y_{lm}(x) \, \frac{g(|x|)}{|x|^l}.
\end{aligned}
\tag{17.17}
\]

For computational purposes, the expression (17.15) is more convenient
than (17.17); in fact, in the analysis of the hydrogen atom, physicists multiply
by $r^l$ at some later point in the calculation, just so that the relevant
differential equation will take on a simpler form.

Proposition 17.19 Every space of the form $V_{l,f} \subset L^2(\mathbb{R}^3)$ is invariant
and irreducible under the action of SO(3). Conversely, every finite-dimensional,
irreducible, SO(3)-invariant subspace of $L^2(\mathbb{R}^3)$ is of the form
$V_{l,f}$ for some non-negative integer $l$ and some $f$ satisfying (17.14).

Proof. Since the factor $f(|x|)$ is invariant under rotations, the action of
SO(3) only affects the function $p$. Thus, $V_{l,f}$ is isomorphic, as a representation
of SO(3), to the space $V_l$, which is irreducible by Theorem 17.12.

For the other direction, the Lebesgue measure on $\mathbb{R}^3$ decomposes as a
product of the surface area measure on $S^2$ with the measure $4\pi r^2 \, dr$ on
$(0, \infty)$. Thus, by a standard measure-theoretic result (Proposition 19.12),
$L^2(\mathbb{R}^3)$ decomposes canonically as the Hilbert tensor product of $L^2(S^2)$
and $L^2((0, \infty))$, where a vector of the form $f \otimes g$ in the tensor product
corresponds to the function $f(\theta, \phi) g(r)$ in $L^2(\mathbb{R}^3)$, as in (17.16). Since $L^2(S^2)$
decomposes (Theorem 17.12) as the sum of the spaces $V_l$, $l = 0, 1, 2, \ldots$,
we can decompose $L^2(\mathbb{R}^3)$ as a sum of spaces of the form

\[ V_{l,k} := V_l \otimes g_k, \]

where the $g_k$’s form an orthonormal basis for $L^2((0, \infty))$.

Now, let $V$ be any finite-dimensional, irreducible, SO(3)-invariant
subspace of $L^2(\mathbb{R}^3)$. Let $\pi_{l,k} : L^2(\mathbb{R}^3) \to V_{l,k}$ be the orthogonal projection
operator, and let $\rho_{l,k}$ be the restriction of $\pi_{l,k}$ to $V$. This map is easily
seen to be an intertwining map for the action of SO(3). Thus, since both $V$
and $V_{l,k}$ are irreducible, Schur’s lemma tells us that each $\rho_{l,k}$ is either zero
or an isomorphism. Furthermore, since the spaces $V_{l,k}$ are nonisomorphic
for different values of $l$, we cannot have both $\rho_{l,k}$ and $\rho_{l',k'}$ being nonzero
for $l \neq l'$. On the other hand, $\rho_{l,k}$ cannot be zero for all $k$ and $l$, since the
$V_{l,k}$’s span $L^2(\mathbb{R}^3)$. Thus, there must be some value $l_0$ of $l$ such that $\rho_{l_0,k_0}$
is nonzero for some $k_0$ but such that $\rho_{l,k} = 0$ for all $l \neq l_0$.

Applying Schur’s lemma again, we see that $\rho_{l_0,k} (\rho_{l_0,k_0})^{-1}$ must be of the
form $c_k I$ for each $k$. Given any $\psi \in V$, let $v$ be the unique element of $V_{l_0}$
such that $\rho_{l_0,k_0}(\psi) = v \otimes g_{k_0}$. Then we have

\[ \rho_{l_0,k}(\psi) = c_k (v \otimes g_k) \]

for every $k$. Since also $\rho_{l,k}(\psi) = 0$ for $l \neq l_0$, we conclude that $\psi$ must be
of the form $v \otimes g$, where

\[ g = \sum_k c_k g_k. \]

Since this holds for each $\psi \in V$ (with the same set of constants $c_k$), we see
that $V = V_{l_0} \otimes g$, which is nothing but the form in (17.16). Then $V$ is of
the form claimed in the proposition, where $f(r) = g(r)/r^{l_0}$. ■

It can further be shown that each closed, SO(3)-invariant subspace of
$L^2(\mathbb{R}^3)$ decomposes as an orthogonal direct sum of finite-dimensional,
irreducible, SO(3)-invariant subspaces. This result is just a special case of a
general result for strongly continuous unitary representations of compact
topological groups. (See, e.g., Chap. 5 of [10].) Since we already know that
$L^2(\mathbb{R}^3)$ is a direct sum of finite-dimensional, irreducible invariant subspaces,
it is probably possible to give an elementary proof of this result, but we
will not pursue that approach here.


17.8 Spin

We classified irreducible finite-dimensional representations of the Lie
algebra so(3) by their “spin” $l$, where $l$ is the largest eigenvalue for the
operator $L_3 = i\pi(F_3)$. The possible values for $l$ are the non-negative integers
$(0, 1, 2, \ldots)$ and the positive half-integers $(1/2, 3/2, \ldots)$. Inside $L^2(S^2)$ and
$L^2(\mathbb{R}^3)$, however, we found only irreducible representations of so(3) with
integer spin. It is easy to understand why the half-integer spin representations
do not occur: They do not correspond to any representation of the
group SO(3). Since $L^2(S^2)$ and $L^2(\mathbb{R}^3)$ both carry a natural unitary action
$\Pi$ of the group SO(3), any finite-dimensional subspace that is invariant under
the associated Lie algebra representation $\pi$ will also be invariant under
$\Pi$ and thus constitute a representation of SO(3).

Although the half-integer representations $\pi_l$ of the Lie algebra so(3) cannot
be exponentiated to representations of SO(3), they can be exponentiated
to representations of the universal cover SU(2) of SO(3), as in the proof
of Proposition 17.10. For a half-integer $l$, the associated representation $\Pi_l'$ of
SU(2) satisfies $\Pi_l'(-I) = -I$, which means that $\Pi_l'$ does not factor through
$\mathrm{SO}(3) \cong \mathrm{SU}(2)/\{I, -I\}$. If, however, we think about projective representations,
we see that $[-I]$ is the identity element in $\mathrm{PU}(V)$. Thus, even when $l$
is a half-integer, we get a well-defined projective representation $\Pi_l$ of SO(3)
that satisfies

\[ \Pi_l(e^{X}) = [e^{\pi_l(X)}] \]

for all $X \in \mathrm{so}(3)$, where $[U]$ denotes the image of $U \in \mathrm{U}(V)$ in $\mathrm{PU}(V)$.

It is generally believed that the physics of the universe is invariant under
the rotation group SO(3). This does not mean that one never considers
models without rotational symmetry, because the local environment of,
say, a hydrogen atom in a magnetic field breaks the rotational symmetry of
the hydrogen atom. Nevertheless, if we were to rotate both the hydrogen
atom and the magnetic field, the physics of the problem would not change.
In quantum mechanics, rotational symmetry means that there should be
a projective unitary representation of SO(3) on the Hilbert space of the
universe that commutes with the Hamiltonian operator. Now, the Hilbert
space of the universe (if there is such a thing) is built up out of Hilbert
spaces for each type of particle. Thus, we expect that the Hilbert space
for a single particle will also carry a projective unitary representation of
SO(3).

The simplest possibility for the Hilbert space of a single particle is the
Hilbert space $L^2(\mathbb{R}^3)$, which certainly carries an (ordinary) unitary action
of SO(3), as we have been discussing in this chapter. Based on various
experimental observations, however, physicists have proposed a modification
to the Hilbert space for an individual particle that incorporates “internal
degrees of freedom.” The proposal is that for each type of particle,
the quantum Hilbert space should be of the form $L^2(\mathbb{R}^3) \hat{\otimes} V$, where $V$
is a finite-dimensional Hilbert space that carries an irreducible projective
unitary representation of SO(3). Here $\hat{\otimes}$ is the Hilbert tensor product
(Appendix A.4.5). The (projective) action of SO(3) on $V$ describes the action
of the rotation group on the internal degrees of freedom of the particle.

Now, according to Proposition 16.46, the space $V$ carries a (trace-zero)
ordinary representation $\pi$ of the Lie algebra so(3). In customary physics
terminology, the largest eigenvalue $l$ of the operator $L_3 := i\pi(F_3)$ in $V$ is
then called the spin of the particle. We then denote the space $V$ by $V_l$ to
indicate the value of the spin. Electrons, for example, are “spin 1/2” particles,
meaning that the Hilbert space for a single electron is $L^2(\mathbb{R}^3) \hat{\otimes} V_{1/2}$,
where $V_{1/2}$ is a two-dimensional projective representation of SO(3).

It is easy to see that the tensor product of two projective unitary representations
of a given group is again a projective unitary representation of
that group. (By contrast, the direct sum of two projective unitary representations
is in general not again a projective unitary representation.) In
the case at hand, we can think of $L^2(\mathbb{R}^3)$ as carrying a unitary representation
$\Pi$ of SU(2) that factors through SO(3), that is, for which $\Pi(-I) = I$.
Meanwhile, we can think of $V_l$ as carrying a unitary representation $\Pi_l$
of SU(2) in which $\Pi_l(-I) = \pm I$, with the plus sign if $l$ is an integer and
the minus sign if $l$ is a half-integer. Thus, $L^2(\mathbb{R}^3) \hat{\otimes} V_l$ carries a unitary
representation $\Pi \otimes \Pi_l$ of SU(2) in which $(\Pi \otimes \Pi_l)(-I) = \pm I$. Thus, in the
projective sense, $\Pi \otimes \Pi_l$ factors through SO(3).

Summary 17.20 (Spin) Each type of particle has a “spin” $l$, which is a
non-negative integer or half-integer. The Hilbert space for such a particle
is $L^2(\mathbb{R}^3) \hat{\otimes} V_l$, where $V_l$ is an irreducible projective representation of SO(3)
of dimension $2l + 1$.

Since $V_l$ is finite dimensional, the Hilbert tensor product $L^2(\mathbb{R}^3) \hat{\otimes} V_l$
coincides with the algebraic tensor product of $L^2(\mathbb{R}^3)$ with $V_l$.

Definition 17.21 A particle for which the spin is an integer is called a boson,
and a particle for which the spin is a half-integer is called a fermion.

To see the significance of the distinction between integer and half-integer 
spin, one needs to look at the structure of the Hilbert space describing 
multiple particles of a given type, such as the Hilbert space for five electrons. 
This topic is discussed in Chap. 19. 


17.9 Tensor Products of Representations: “Addition of Angular Momentum”

Let $V_l$ and $V_m$ be irreducible representations of so(3) with dimensions $2l + 1$
and $2m + 1$, respectively. As discussed in Sect. 16.8, the tensor product
space $V_l \otimes V_m$ can be viewed as another representation of so(3). Unless
one of $l$ and $m$ is zero, $V_l \otimes V_m$ is not irreducible. It is of interest, then,
to decompose $V_l \otimes V_m$ as a direct sum of irreducible invariant subspaces.
This decomposition (in the case that $V_l$ is an irreducible SO(3)-invariant
subspace of $L^2(\mathbb{R}^3)$ and $V_m$ is the space of internal degrees of freedom of a
particle) will help us in decomposing the Hilbert space for a particle with
spin into irreducible, SO(3)-invariant subspaces.

Proposition 17.22 Let $V_{1/2}$ be an irreducible representation of so(3) of
dimension 2, and let $V_l$ be an irreducible representation of so(3) of dimension
$2l + 1$, where $l$ is a non-negative integer or half-integer. If $l = 0$,
$V_l \otimes V_{1/2}$ is irreducible. If $l > 0$, then we have

\[ V_l \otimes V_{1/2} \cong V_{l+1/2} \oplus V_{l-1/2}, \]

where “$\cong$” denotes an isomorphism of representations.

Proof. If $l = 0$, then it is easy to see that $V_l \otimes V_{1/2}$ is isomorphic to $V_{1/2}$,
which is irreducible. Assume, then, that $l > 0$.

Let $L^+$, $L^-$, and $L_3$ be the operators in Theorem 17.4, constructed using
the representation $\pi_l$, and let $\sigma^+$, $\sigma^-$, and $\sigma_3$ be the analogous operators
constructed using the representation $\pi_{1/2}$. As in Sect. 16.8, we define operators
$J^+$, $J^-$, and $J_3$ on $V_l \otimes V_{1/2}$ by

\[
\begin{aligned}
J^+ &= L^+ \otimes I + I \otimes \sigma^+ \\
J^- &= L^- \otimes I + I \otimes \sigma^- \\
J_3 &= L_3 \otimes I + I \otimes \sigma_3.
\end{aligned}
\tag{17.18}
\]

Let $\{v_0, \ldots, v_{2l}\}$ be a basis for $V_l$ as in Theorem 17.4, and let $\{e_0, e_1\}$ be
a similar basis for $V_{1/2}$. Then the vectors of the form $v_j \otimes e_k$ form a basis
for $V_l \otimes V_{1/2}$. The eigenvalues of $J_3$ are the numbers of the form

\[ (l - j) + (1/2 - k), \]

$j = 0, 1, \ldots, 2l$, $k = 0, 1$. Thus, the eigenvalues of $J_3$ range from $l + 1/2$ to
$-(l + 1/2)$. The numbers $l + 1/2$ and $-(l + 1/2)$ occur as eigenvalues only
once. All other eigenvalues $\lambda$ occur twice, once as $(\lambda - 1/2) + 1/2$ and once
as $(\lambda + 1/2) - 1/2$.

The vector $v_0 \otimes e_0$ is an eigenvector for $J_3$ with the largest possible
eigenvalue $l + 1/2$, so that $J^+(v_0 \otimes e_0) = 0$. According to Proposition 17.9,
if we apply $J^-$ repeatedly, we will obtain a “chain” of eigenvectors of length
$2l + 2$, and the span of these vectors forms an irreducible invariant subspace
$W_0$ isomorphic to $V_{l+1/2}$.

Now, by Proposition 17.7, there exist inner products on $V_l$ and $V_{1/2}$
that make $\pi_l$ and $\pi_{1/2}$ “unitary,” meaning that $\pi(X)^* = -\pi(X)$ for all
$X \in \mathrm{so}(3)$. If we use on $V_l \otimes V_{1/2}$ the natural inner product, obtained from
the inner products on $V_l$ and $V_{1/2}$ as in Appendix A.4.5, then $\pi_l \otimes \pi_{1/2}$ is
also unitary. Thus, the orthogonal complement $W_0^\perp$ of the invariant subspace
$W_0$ is also invariant. Since all eigenvalues for $J_3$ except the largest and
smallest have multiplicity 2, we see that the largest eigenvalue for $J_3$ in
$W_0^\perp$ is $l - 1/2$. Let $w_0 \in W_0^\perp$ be an eigenvector for $J_3$ with eigenvalue
$l - 1/2$. If we repeatedly apply the lowering operator $J^- = L^- \otimes I + I \otimes \sigma^-$
to $w_0$, we will obtain a chain of eigenvectors of length $2l$. These eigenvectors
span an irreducible invariant subspace $W_1$ of $V_l \otimes V_{1/2}$ of dimension $2l$. Since

\[ \dim W_0 + \dim W_1 = 4l + 2 = \dim(V_l \otimes V_{1/2}), \]

we must have $W_1 = W_0^\perp$, completing the proof. ■
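
The eigenvalue bookkeeping in this proof can be tabulated directly. The sketch below assumes only that the eigenvalues of $J_3$ on the basis vectors $v_j \otimes e_k$ are $(l - j) + (1/2 - k)$; the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction

def j3_eigenvalues(l):
    """Multiset of eigenvalues (l - j) + (1/2 - k) of J_3 on V_l (x) V_{1/2}."""
    l = Fraction(l)
    half = Fraction(1, 2)
    return Counter((l - j) + (half - k)
                   for j in range(int(2 * l) + 1) for k in range(2))

def chain(top):
    """Eigenvalues top, top - 1, ..., -top of one irreducible chain."""
    top = Fraction(top)
    return Counter(top - n for n in range(int(2 * top) + 1))

for l in [Fraction(1, 2), 1, Fraction(3, 2), 2]:
    l = Fraction(l)
    total = j3_eigenvalues(l)
    pieces = chain(l + Fraction(1, 2)) + chain(l - Fraction(1, 2))
    table = ", ".join(f"{ev}: {mult}" for ev, mult in sorted(total.items(), reverse=True))
    print(f"l = {l}: multiplicities {{{table}}}; matches two chains: {total == pieces}")
```

The extreme eigenvalues $\pm(l + 1/2)$ appear once and all others twice, which is exactly what the chains of lengths $2l + 2$ and $2l$ account for.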

Since an electron is a “spin 1/2” particle, the Hilbert space for a single
electron is, according to Sect. 17.8, $L^2(\mathbb{R}^3) \hat{\otimes} V_{1/2}$, where $V_{1/2}$ is an
irreducible projective unitary representation of SO(3) of dimension 2. Meanwhile,
in Sect. 17.7, we saw how to find irreducible, SO(3)-invariant subspaces
$V_{l,f}$ of $L^2(\mathbb{R}^3)$ of dimension $2l + 1$, for $l = 0, 1, 2, \ldots$, where $f$ is
an arbitrary radial function. By applying Proposition 17.22 to the case
$V_l = V_{l,f}$, we obtain irreducible SO(3)-invariant subspaces of the Hilbert
space $L^2(\mathbb{R}^3) \hat{\otimes} V_{1/2}$. Finding such subspaces is essential in, for example,
analyzing the fine structure of the hydrogen atom.

In the case that $V_l$ is an SO(3)-invariant subspace of $L^2(\mathbb{R}^3)$, the formula
for, say, the operator $J_3$ in (17.18) is written in the physics
literature as

\[ J_3 = L_3 + \sigma_3, \tag{17.19} \]

where it is understood that $L_3$ acts on the first factor in the tensor product
and $\sigma_3$ acts on the second factor. (That is to say, the tensor product
with the identity operator is understood and thus not written.) Here $L_3$ is
the ordinary angular momentum operator and $\sigma_3$ describes the action of
the basis element $F_3 \in \mathrm{so}(3)$ on the space $V_{1/2}$. Formulas such as (17.19)
account for the physics terminology “addition of angular momentum” to
describe the analysis of tensor products of representations of so(3). In this
context, the operator $L_3$ ($= L_3 \otimes I$) is called an orbital angular momentum
operator, and the operator $\sigma_3$ ($= I \otimes \sigma_3$) is called a spin angular momentum
operator, and similarly for $L^\pm$ and $\sigma^\pm$.

We now record the general result for tensor products of irreducible rep¬ 
resentations of so(3). 

Proposition 17.23 For any $j = 0, 1/2, 1, \ldots$, let $V_j$ denote the unique
irreducible representation of so(3) of dimension $2j + 1$. Then for any $l$ and
$m$ with $l \geq m$, we have

\[ V_l \otimes V_m \cong V_{l+m} \oplus V_{l+m-1} \oplus \cdots \oplus V_{l-m+1} \oplus V_{l-m}. \tag{17.20} \]

The proof of this result is similar to that of Proposition 17.22, and is
omitted; see Theorem D.1 in Appendix D of [21]. An important property
of this decomposition is that each irreducible representation that occurs
on the right-hand side of (17.20) occurs only once. This property of the
representations of so(3) is the key idea in the proof of the Wigner-Eckart
theorem. See Appendix D of [21] for details.
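
The decomposition (17.20) can be illustrated at the level of weights. The sketch below is a consistency check, not a proof: it assumes that the eigenvalues of $J_3$ on $V_l \otimes V_m$ are the sums of the eigenvalues on the two factors, and it repeatedly removes the chain belonging to the largest remaining eigenvalue. The function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction

def weights(j):
    """Eigenvalues j, j - 1, ..., -j of L_3 on the spin-j representation."""
    j = Fraction(j)
    return [j - n for n in range(int(2 * j) + 1)]

def decompose(l, m):
    """Spins occurring in V_l (x) V_m, found by peeling off highest weights."""
    remaining = Counter(a + b for a in weights(l) for b in weights(m))
    spins = []
    while remaining:
        top = max(remaining)             # largest remaining J_3 eigenvalue
        spins.append(top)
        remaining -= Counter(weights(top))  # remove one full chain
    return spins

for l, m in [(1, Fraction(1, 2)), (2, 1), (Fraction(3, 2), Fraction(3, 2)), (3, 1)]:
    spins = decompose(l, m)
    dims_match = sum(2 * j + 1 for j in spins) == (2 * l + 1) * (2 * m + 1)
    print(f"V_{l} (x) V_{m}: spins {[str(j) for j in spins]}, dimensions match: {dims_match}")
```

In each case the spins run from $l + m$ down to $l - m$ in integer steps, each appearing once, in agreement with the multiplicity-one property noted above.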


17.10 Vectors and Vector Operators 

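Definition 17.24 A function $\mathbf{c} : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}^3$ is said to transform like
a vector if

\[ \mathbf{c}(R\mathbf{x}, R\mathbf{p}) = R(\mathbf{c}(\mathbf{x}, \mathbf{p})) \tag{17.21} \]

for all $R \in \mathrm{SO}(3)$.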

In the physics literature, the expression “is a vector” is sometimes used 
in place of “transforms like a vector.” 

Note that in Definition 17.24, we only consider the transformation property
of $\mathbf{c}$ under elements of SO(3) rather than under a general element of
O(3). If $\mathbf{c}$ transforms like a vector, one says that $\mathbf{c}$ is a “true vector” if $\mathbf{c}$
satisfies (17.21) for all $R$ in O(3) [not just in SO(3)] and one says that $\mathbf{c}$ is a
“pseudovector” if $\mathbf{c}$ satisfies $\mathbf{c}(R\mathbf{x}, R\mathbf{p}) = -R(\mathbf{c}(\mathbf{x}, \mathbf{p}))$ for $R \in \mathrm{O}(3) \setminus \mathrm{SO}(3)$.
For our purposes, it is not necessary to distinguish between true vectors
and pseudovectors.

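The position function $\mathbf{c}_1(\mathbf{x}, \mathbf{p}) := \mathbf{x}$, the momentum function $\mathbf{c}_2(\mathbf{x}, \mathbf{p}) :=
\mathbf{p}$, and the angular momentum function $\mathbf{c}_3(\mathbf{x}, \mathbf{p}) := \mathbf{x} \times \mathbf{p}$ are simple examples
of functions that transform like vectors. (Transformation under rotations
is one of the standard properties of the cross product.) A typical example
of a function transforming like a vector is $\mathbf{c}(\mathbf{x}, \mathbf{p}) = (\mathbf{x} \cdot \mathbf{p}) \, |\mathbf{x}| \, (\mathbf{x} \times \mathbf{p})$.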
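
The transformation property (17.21) can be spot-checked numerically for the examples just listed. The following sketch tests one particular rotation and one particular pair $(\mathbf{x}, \mathbf{p})$, both chosen arbitrarily for illustration; the function `first_component_only` is a made-up counterexample that does not transform like a vector.

```python
import math

def rotation(axis, theta):
    """Rotation matrix by angle theta about coordinate axis 0, 1, or 2."""
    c, s = math.cos(theta), math.sin(theta)
    i, j = (axis + 1) % 3, (axis + 2) % 3
    R = [[1.0 if r == k else 0.0 for k in range(3)] for r in range(3)]
    R[i][i], R[i][j], R[j][i], R[j][j] = c, -s, s, c
    return R

def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)] for r in range(3)]

def apply(R, v):
    return [sum(R[r][k] * v[k] for k in range(3)) for r in range(3)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def position(x, p):
    return list(x)

def momentum(x, p):
    return list(p)

def angular_momentum(x, p):
    return cross(x, p)

def typical(x, p):
    """c(x, p) = (x . p) |x| (x cross p)."""
    scale = dot(x, p) * math.sqrt(dot(x, x))
    return [scale * u for u in cross(x, p)]

def first_component_only(x, p):
    """A function that singles out a direction, so (17.21) should fail."""
    return [x[0], 0.0, 0.0]

def transforms_like_vector(c, R, x, p, tol=1e-9):
    """Compare c(Rx, Rp) with R c(x, p) componentwise."""
    lhs = c(apply(R, x), apply(R, p))
    rhs = apply(R, c(x, p))
    return all(abs(a - b) < tol for a, b in zip(lhs, rhs))

R = matmul(rotation(2, 0.7), rotation(0, -1.3))  # a generic rotation in SO(3)
x, p = [1.0, -2.0, 0.5], [0.3, 0.7, -1.1]
for c in (position, momentum, angular_momentum, typical, first_component_only):
    print(c.__name__, transforms_like_vector(c, R, x, p))
```

A single passing test does not prove the property, but a failure (as for the last function) is enough to rule it out.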

Proposition 17.25 Let $\mathbf{j}(\mathbf{x}, \mathbf{p}) = \mathbf{x} \times \mathbf{p}$ denote the angular momentum
function on $\mathbb{R}^3 \times \mathbb{R}^3$. Suppose a smooth function $\mathbf{c} : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}^3$ transforms
like a vector. Then we have

\[ \{c_k, j_k\} = 0 \tag{17.22} \]

for $k = 1, 2, 3$. Furthermore, we have

\[ \{c_1, j_2\} = \{j_1, c_2\} = c_3 \tag{17.23} \]

and other relations obtained from (17.23) by cyclically permuting the
indices.

Proof. Let $R(\theta)$ denote a counterclockwise rotation by angle $\theta$ in the
$(x_1, x_2)$-plane. Applying (17.21) with $R = R(\theta)$ and looking only at the
first component of the vectors, we have

\[ c_1(R(\theta)\mathbf{x}, R(\theta)\mathbf{p}) = c_1(\mathbf{x}, \mathbf{p}) \cos\theta - c_2(\mathbf{x}, \mathbf{p}) \sin\theta. \tag{17.24} \]

Now, as in the proof of Proposition 2.30, the Poisson bracket $\{c_1, j_3\}$ is
precisely the derivative of the left-hand side of (17.24) with respect to $\theta$,
evaluated at $\theta = 0$. Thus,

\[ \{c_1, j_3\} = -c_2 \]

and so $\{j_3, c_1\} = c_2$, which is one of the relations obtained from (17.23) by
cyclically permuting the indices.

Meanwhile, if we again apply (17.21) with $R = R(\theta)$ but look now at the
third component of the vectors, we have that

\[ c_3(R(\theta)\mathbf{x}, R(\theta)\mathbf{p}) = c_3(\mathbf{x}, \mathbf{p}). \]

Differentiating this relation with respect to $\theta$ at $\theta = 0$ gives $\{c_3, j_3\} = 0$.
All other brackets are computed similarly. ■
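
The relations (17.22) and (17.23) can be spot-checked with finite differences. The sketch below assumes the Poisson bracket convention $\{f, g\} = \sum_j (\partial f/\partial x_j \, \partial g/\partial p_j - \partial f/\partial p_j \, \partial g/\partial x_j)$, which is consistent with $\{c_1, j_3\} = -c_2$ for $\mathbf{c} = \mathbf{x}$; the sample point and step size are arbitrary choices.

```python
def grad(f, z, h=1e-5):
    """Central-difference gradient of f at z = (x1, x2, x3, p1, p2, p3)."""
    g = []
    for i in range(6):
        zp, zm = list(z), list(z)
        zp[i] += h
        zm[i] -= h
        g.append((f(zp) - f(zm)) / (2 * h))
    return g

def poisson(f, g, z):
    """{f, g} = sum_j (df/dx_j dg/dp_j - df/dp_j dg/dx_j), an assumed convention."""
    df, dg = grad(f, z), grad(g, z)
    return sum(df[j] * dg[j + 3] - df[j + 3] * dg[j] for j in range(3))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]

def position(x, p):
    return list(x)

def angular_momentum(x, p):
    return cross(x, p)

def component(c, k):
    """Turn the k-th component of a vector-valued c(x, p) into a function of z."""
    return lambda z: c(z[:3], z[3:])[k]

z0 = [0.4, -1.2, 0.7, 1.5, 0.3, -0.8]
j = [component(angular_momentum, k) for k in range(3)]

for c in (position, angular_momentum):
    cs = [component(c, k) for k in range(3)]
    print(c.__name__)
    print("  {c1, j2} - c3 =", poisson(cs[0], j[1], z0) - cs[2](z0))
    print("  {j1, c2} - c3 =", poisson(j[0], cs[1], z0) - cs[2](z0))
    print("  {c1, j3} + c2 =", poisson(cs[0], j[2], z0) + cs[1](z0))
    print("  {ck, jk}      =", [poisson(cs[k], j[k], z0) for k in range(3)])
```

All printed quantities should vanish up to rounding error, since the test functions are polynomials of degree at most two, for which central differences are essentially exact.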

We now turn to the quantum counterpart of a function that transforms 
like a vector. 

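Definition 17.26 For any ordered triple $\mathbf{C} := (C_1, C_2, C_3)$ of operators
on $L^2(\mathbb{R}^3)$ and any vector $\mathbf{v} \in \mathbb{R}^3$, let $\mathbf{v} \cdot \mathbf{C}$ be the operator

\[ \mathbf{v} \cdot \mathbf{C} = \sum_{j=1}^{3} v_j C_j. \tag{17.25} \]

Then an ordered triple $\mathbf{C}$ of operators on $L^2(\mathbb{R}^3)$ is called a vector operator
if

\[ (R\mathbf{v}) \cdot \mathbf{C} = \Pi(R) (\mathbf{v} \cdot \mathbf{C}) \Pi(R)^{-1} \tag{17.26} \]

for all $R \in \mathrm{SO}(3)$.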

Here $\Pi(\cdot)$ is the natural unitary action of SO(3) on $L^2(\mathbb{R}^3)$ in Definition 17.1.
Let us try to understand what this definition is saying in the
case of, say, the angular momentum, which is (as we shall see) a vector operator.
The operators $J_1$, $J_2$, and $J_3$ represent the components of $\mathbf{J}$ in the
directions of $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$, respectively. More generally, we can consider
the component of $\mathbf{J}$ in the direction of any unit vector $\mathbf{v}$, which will be
nothing but $\mathbf{v} \cdot \mathbf{J}$, as defined in (17.25). Since there is no preferred direction
in space, we expect that for any two unit vectors $\mathbf{v}_1$ and $\mathbf{v}_2$, the operators
$\mathbf{v}_1 \cdot \mathbf{J}$ and $\mathbf{v}_2 \cdot \mathbf{J}$ should be “the same operator, up to rotation.” Specifically,
if $R$ is some rotation with $R\mathbf{v}_1 = \mathbf{v}_2$, then $\mathbf{v}_1 \cdot \mathbf{J}$ and $\mathbf{v}_2 \cdot \mathbf{J}$ should differ
only by the action of $R$ on the Hilbert space $L^2(\mathbb{R}^3)$. But this is precisely
what (17.26) says, with $\mathbf{v} = \mathbf{v}_1$ and $\mathbf{C} = \mathbf{J}$:

\[ \mathbf{v}_2 \cdot \mathbf{J} = \Pi(R) (\mathbf{v}_1 \cdot \mathbf{J}) \Pi(R)^{-1}. \]

We will not concern ourselves with the question of whether (17.26)
continues to hold for $R \in \mathrm{O}(3) \setminus \mathrm{SO}(3)$. The position and momentum operators
$\mathbf{X}$ and $\mathbf{P}$ are easily seen to be vector operators. As in the classical case,
the cross product of two vector operators is again a vector operator. (See
Exercise 7 in Chap. 18.) In particular, the angular momentum $\mathbf{J} = \mathbf{X} \times \mathbf{P}$
is a vector operator.

If the operators $C_1$, $C_2$, and $C_3$ are unbounded, we should say something
in Definition 17.26 about the domains of the operators in question. The simplest
approach is to find some dense subspace $V$ of $L^2(\mathbb{R}^3)$ that is contained
in the domain of each $C_j$ and such that $V$ is invariant under rotations. In
that case, the equality in (17.26) is understood to hold when applied to a
vector in $V$. In many cases, we can take $V$ to be the Schwartz space $\mathcal{S}(\mathbb{R}^3)$.
In the following proposition, the space $V$ should satisfy certain technical
domain conditions that permit differentiation of (17.29) when applied to a
vector $\psi$ in $V$. We will not pursue the details of such conditions here.

Proposition 17.27 If $\mathbf{C}$ is a vector operator, then the components of $\mathbf{C}$
satisfy

\[ \frac{1}{i\hbar} [C_j, J_j] = 0 \tag{17.27} \]

for $j = 1, 2, 3$. Furthermore, we have

\[ \frac{1}{i\hbar} [C_1, J_2] = \frac{1}{i\hbar} [J_1, C_2] = C_3, \tag{17.28} \]

and other relations obtained from (17.28) by cyclically permuting the
indices.

Proof. As in the proof of Proposition 17.25, let $R(\theta)$ denote a rotation in the
$(x_1, x_2)$-plane, and let $\mathbf{e}_1 = (1, 0, 0)$. Applying (17.26) with $R = R(\theta)$ and
$\mathbf{v} = \mathbf{e}_1$, we have

\[ \Pi(R(\theta))\, C_1\, \Pi(R(\theta))^{-1} = C_1 \cos\theta + C_2 \sin\theta. \tag{17.29} \]

But $R(\theta) = e^{\theta F_3}$, where $\{F_j\}$ is the basis for $\mathrm{so}(3)$ described in Sect. 16.5.
Thus, differentiating (17.29) with respect to $\theta$ at $\theta = 0$ gives

\[ \pi(F_3) C_1 - C_1 \pi(F_3) = C_2. \]

Since $J_3 = i\hbar\,\pi(F_3)$ (Proposition 17.3), we obtain $(1/(i\hbar))[J_3, C_1] = C_2$,
which is one of the relations obtained from (17.28) by cyclically permuting
the indices.

Meanwhile, applying (17.26) with $R = R(\theta)$ and $\mathbf{v} = \mathbf{e}_3$ gives

\[ \Pi(R(\theta))\, C_3\, \Pi(R(\theta))^{-1} = C_3. \]

Differentiating this relation with respect to $\theta$ at $\theta = 0$ gives $[\pi(F_3), C_3] = 0$.
All other relations are obtained similarly. ■
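
As a sanity check of (17.28), one can work out the case in which the vector operator is the position operator. The sketch below assumes the usual expression $J_2 = X_3 P_1 - X_1 P_3$ for the second component of $\mathbf{X} \times \mathbf{P}$, the canonical relations $[X_j, P_k] = i\hbar\,\delta_{jk} I$, and a common invariant domain such as the Schwartz space; it is an illustration, not part of the proof.

```latex
\begin{align*}
[X_1, J_2] &= [X_1,\, X_3 P_1 - X_1 P_3] \\
           &= X_3 [X_1, P_1] - X_1 [X_1, P_3] \\
           &= i\hbar\, X_3 ,
\end{align*}
% so that (1/(i\hbar)) [X_1, J_2] = X_3, in agreement with (17.28) for C = X.
```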

For more information about vector operators, including the Wigner-Eckart
theorem, see Appendix D of [21]. See also Exercise 7.

17.11 Exercises

1. Verify the expression (17.2) for the vector field
   $x_1\,\partial/\partial x_2 - x_2\,\partial/\partial x_1$.

2. Verify the relation (17.12) in the proof of Theorem 17.4, using induction
   on $j$ and the commutation relation (17.10).

3. This exercise provides a proof of Proposition 17.8. Let $(\pi, V)$ denote
   an irreducible representation of $\mathrm{so}(3)$ of dimension $2l + 1$ and let $C_\pi$
   denote the Casimir operator as defined in the proposition.

   (a) Show that $[\pi(F_j), C_\pi] = 0$ for all $j = 1, 2, 3$.

   (b) Using Schur’s lemma, show that there is some $\lambda \in \mathbb{C}$ such that
       $C_\pi v = \lambda v$ for all $v \in V$.

   (c) Show that

       \[ C_\pi = -\left( (L_3)^2 + L^- L^+ + L_3 \right), \]

       where $L^+$, $L^-$, and $L_3$ are as in Theorem 17.4.

   (d) By computing $C_\pi$ on some suitably chosen vector in $V$, show
       that the constant $\lambda$ in Part (b) has the value $-l(l + 1)$.

4. Let $l$ be any non-negative integer or half-integer. Construct a vector
   space $V$ by decreeing that vectors $\{v_0, v_1, \ldots, v_{2l}\}$ form a basis
   for $V$. Define operators $L^+$, $L^-$, and $L_3$ on $V$ by the expressions
   in (17.6). Show that these operators satisfy the commutation relations
   (17.8), (17.9), and (17.10).

   Hint: In the case of $L^-$, treat the vector $v_{2l}$ separately from the other
   basis vectors. In the case of $L^+$, treat the vector $v_0$ separately
   from the other basis vectors.

5. Let $(\pi, V)$ be an irreducible representation of $\mathrm{so}(3)$ of dimension 2,
   with basis $\{v_0, v_1\}$ as in (17.6). Consider $V \oplus V$ as a representation
   of $\mathrm{so}(3)$ as in Sect. 16.8. Let $v = (v_0, v_1)$. Show that the smallest
   invariant subspace of $V \oplus V$ containing $v$ is $V \oplus V$.

   Note: This shows that $V \oplus V$ has a cyclic vector, even though $V \oplus V$
   is not irreducible.

6. Compute explicit bases for the two irreducible invariant subspaces
   $W_0 \cong V_{3/2}$ and $W_1 \cong V_{1/2}$ of $V_1 \otimes V_{1/2}$. Each basis element for $W_0$
   or $W_1$ should be expressed as a linear combination of the elements
   $v_j \otimes e_k$ in the proof of Proposition 17.22.

7. Let $V_l$, $V_m$, and $V_n$ be irreducible representations of $\mathrm{so}(3)$ of dimension
   $2l + 1$, $2m + 1$, and $2n + 1$, respectively. Suppose that $\Phi$ and $\Psi$ are
   nonzero intertwining maps of $V_l$ into $V_m \otimes V_n$. Show that $\Phi = c\Psi$ for
   some $c \in \mathbb{C}$.

   Hint: Use Proposition 17.23 and Schur’s lemma.

   Note: This result is closely related to the Wigner-Eckart theorem for
   “irreducible tensor operators.”


18 

Radial Potentials and the Hydrogen Atom


18.1 Radial Potentials 


If $V$ is any radial function on $\mathbb{R}^3$, let $\hat{H} = -(\hbar^2/(2m))\Delta + V$ be the
corresponding Hamiltonian operator, acting on $L^2(\mathbb{R}^3)$. We will look for
solutions to the time-independent Schrödinger equation $\hat{H}\psi = E\psi$ of the
form $\psi(\mathbf{x}) = p(\mathbf{x}) f(|\mathbf{x}|)$, where $f$ is a smooth function on $(0, \infty)$ and $p$ is a
harmonic polynomial on $\mathbb{R}^3$ that is homogeneous of degree $l$.

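Proposition 18.1 Let $p$ be a harmonic polynomial on $\mathbb{R}^3$ that is homogeneous
of degree $l$ and let $f$ be a smooth function on $(0, \infty)$. Let $\psi$ be the
function on $\mathbb{R}^3 \setminus \{0\}$ given by

\[ \psi(\mathbf{x}) = p(\mathbf{x}) f(|\mathbf{x}|). \]

Then on $\mathbb{R}^3 \setminus \{0\}$ we have

\[ \Delta\psi(\mathbf{x}) = p(\mathbf{x}) \left[ \frac{d^2 f}{dr^2} + \frac{2(l + 1)}{r} \frac{df}{dr} \right], \tag{18.1} \]

where the derivatives of $f$ are evaluated at $r = |\mathbf{x}|$.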


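Proof. We begin with the case $l = 0$, so that $p$ is a constant (which we
take to be 1) and $\psi$ is just the radial function $f(|\mathbf{x}|)$. Then

\[ \frac{\partial}{\partial x_j} f(|\mathbf{x}|) = \frac{df}{dr} \frac{x_j}{|\mathbf{x}|} \]

and so

\[ \sum_{j=1}^{3} \frac{\partial^2}{\partial x_j^2} f(|\mathbf{x}|)
   = \sum_{j=1}^{3} \left[ \frac{d^2 f}{dr^2} \frac{x_j^2}{|\mathbf{x}|^2}
     + \frac{df}{dr} \left( \frac{1}{|\mathbf{x}|} - \frac{x_j^2}{|\mathbf{x}|^3} \right) \right]
   = \frac{d^2 f}{dr^2} + \frac{2}{r} \frac{df}{dr}. \]

For the general case, the product rule for the Laplacian gives

\[ \Delta\psi = (\Delta p) f(|\mathbf{x}|) + 2 \nabla p \cdot \nabla f(|\mathbf{x}|) + p\, \Delta f(|\mathbf{x}|). \]

Now, $\Delta p = 0$ by assumption. Furthermore, since $f(|\mathbf{x}|)$ is radial, its gradient
points in the radial direction. Thus, only the radial component of
$\nabla p$ is relevant. Moreover, on each ray through the origin, $p$ behaves like a
constant times $r^l$. Thus, the $r$-derivative of $p$ is $(l/r)p$, giving

\[ \Delta\psi = \frac{2l}{r}\, p\, \frac{df}{dr} + p\, \frac{d^2 f}{dr^2} + \frac{2}{r}\, p\, \frac{df}{dr}, \]

which simplifies to the desired expression. ■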
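
Formula (18.1) can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: it picks the sample harmonic polynomial $p(\mathbf{x}) = x_1^2 - x_2^2$ (degree $l = 2$) and the sample radial function $f(r) = e^{-r}$, and compares a central-difference Laplacian of $p(\mathbf{x}) f(|\mathbf{x}|)$ with the right-hand side of (18.1) at one arbitrary point. The function names are hypothetical.

```python
import math

def laplacian(func, point, h=1e-3):
    """Approximate the Laplacian of func at point by central differences."""
    x, y, z = point
    center = func(x, y, z)
    total = 0.0
    for dx, dy, dz in ((h, 0.0, 0.0), (0.0, h, 0.0), (0.0, 0.0, h)):
        total += (func(x + dx, y + dy, z + dz) - 2.0 * center
                  + func(x - dx, y - dy, z - dz))
    return total / h**2

L_DEGREE = 2  # degree of the sample harmonic polynomial below

def p(x, y, z):
    """Sample harmonic polynomial, homogeneous of degree 2 (an assumption)."""
    return x * x - y * y

def f(r):
    """Sample radial function (an assumption); note f' = -f and f'' = f."""
    return math.exp(-r)

def psi(x, y, z):
    return p(x, y, z) * f(math.sqrt(x * x + y * y + z * z))

def formula_18_1(x, y, z):
    """Right-hand side of (18.1) for the sample p and f above."""
    r = math.sqrt(x * x + y * y + z * z)
    return p(x, y, z) * (f(r) + 2.0 * (L_DEGREE + 1) / r * (-f(r)))

point = (0.7, -0.4, 1.1)  # arbitrary point away from the origin
numeric = laplacian(psi, point)
exact = formula_18_1(*point)
print(numeric, exact)
```

The two printed numbers should agree up to the discretization error of the finite-difference stencil.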

Although the decomposition of functions in Definition 17.18 is for many
purposes the most convenient one, it is not quite the customary way of turning
spherical harmonics into functions on $\mathbb{R}^3$. Conventionally, one works in
polar coordinates and considers functions of the form

\[ \psi(r, \theta, \phi) = p(\theta, \phi)\, g(r), \]

where $p$ is the restriction to $S^2$ of an element of $V_l$. We can express this
decomposition in rectangular coordinates as

\[ \psi(\mathbf{x}) = p\!\left( \frac{\mathbf{x}}{|\mathbf{x}|} \right) g(|\mathbf{x}|) = \frac{p(\mathbf{x})}{|\mathbf{x}|^l}\, g(|\mathbf{x}|). \]

We can then obtain a more customary form of Proposition 18.1 as follows. 

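Proposition 18.2 Suppose $p \in V_l$ and $g$ is a smooth function on $(0, \infty)$,
and let $\psi$ be the function on $\mathbb{R}^3 \setminus \{0\}$ given by

\[ \psi(\mathbf{x}) = p\!\left( \frac{\mathbf{x}}{|\mathbf{x}|} \right) g(|\mathbf{x}|). \]

Then

\[ (\Delta\psi)(r\mathbf{x}) = p(\mathbf{x}) \left[ \frac{d^2 g}{dr^2} + \frac{2}{r} \frac{dg}{dr} - \frac{l(l + 1)}{r^2}\, g(r) \right] \tag{18.2} \]

for all $\mathbf{x} \in S^2$ and $r \in (0, \infty)$.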










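Proof. Since $p$ is homogeneous of degree $l$,

\[ p\!\left( \frac{\mathbf{x}}{|\mathbf{x}|} \right) = \frac{p(\mathbf{x})}{|\mathbf{x}|^l}. \]

Thus,

\[ \psi(\mathbf{x}) = p(\mathbf{x})\, \frac{g(|\mathbf{x}|)}{|\mathbf{x}|^l}. \]

Applying Proposition 18.1 gives

\[ \Delta\psi(\mathbf{x}) = p(\mathbf{x}) \left[ \frac{d^2}{dr^2} + \frac{2(l + 1)}{r} \frac{d}{dr} \right]
   \left( \frac{g(r)}{r^l} \right) \Bigg|_{r = |\mathbf{x}|}. \]

From here it is a straightforward but unilluminating calculation to verify the
formula in the proposition. ■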
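
For readers who want to see the omitted calculation, here is one way it can go. This is a sketch, writing $f(r) = g(r)/r^l$ for the radial factor to which Proposition 18.1 is applied and using $p(r\mathbf{x}) = r^l p(\mathbf{x})$ for $\mathbf{x} \in S^2$.

```latex
\begin{align*}
f'(r)  &= r^{-l} g'(r) - l\, r^{-l-1} g(r), \\
f''(r) &= r^{-l} g''(r) - 2l\, r^{-l-1} g'(r) + l(l+1)\, r^{-l-2} g(r), \\
f''(r) + \frac{2(l+1)}{r} f'(r)
       &= r^{-l} \left[ g''(r) + \frac{2}{r} g'(r) - \frac{l(l+1)}{r^2} g(r) \right].
\end{align*}
% Evaluating (18.1) at the point r x with x in S^2 multiplies this by
% p(r x) = r^l p(x), which cancels the factor r^{-l} and gives (18.2).
```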

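Still another way to write functions on $\mathbb{R}^3$ is in the form

\[ \psi(\mathbf{x}) = p\!\left( \frac{\mathbf{x}}{|\mathbf{x}|} \right) \frac{h(|\mathbf{x}|)}{|\mathbf{x}|}, \tag{18.3} \]

so that $h(r) = r g(r)$. If we replace $g(r)$ by $h(r)/r$ in (18.2), we obtain, after
a short calculation,

\[ (\Delta\psi)(r\mathbf{x}) = p(\mathbf{x})\, \frac{1}{r} \left[ \frac{d^2 h}{dr^2} - \frac{l(l + 1)}{r^2}\, h(r) \right],
   \quad \mathbf{x} \in S^2. \tag{18.4} \]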


Writing wave functions in the form (18.3) is convenient because we then
have, for any radial potential,

\[ -\frac{\hbar^2}{2m} \Delta\psi + V(|\mathbf{x}|)\psi
   = p\!\left( \frac{\mathbf{x}}{|\mathbf{x}|} \right) \frac{1}{r}
     \left[ -\frac{\hbar^2}{2m} \frac{d^2 h}{dr^2} + V_{\mathrm{eff}}(r)\, h(r) \right],
   \quad r = |\mathbf{x}|, \tag{18.5} \]

where $V_{\mathrm{eff}}$ is the effective potential given by

\[ V_{\mathrm{eff}}(r) = V(r) + \frac{\hbar^2\, l(l + 1)}{2 m r^2}. \tag{18.6} \]

Note that the quantity in square brackets in (18.5) is just an ordinary
one-dimensional Schrödinger operator, since the first derivative term in (18.2)
has been eliminated. Despite the naturalness of the form (18.3), it is the
form (18.1) that is ultimately most convenient for finding the bound states
of the hydrogen atom Hamiltonian.
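
To get a feel for the effective potential (18.6), the following sketch evaluates it for an attractive Coulomb potential. It is an illustration only: the units $\hbar = m = Q = 1$ and the helper names are assumptions made here, not conventions from the text. In these units the centrifugal term creates a minimum at $r = l(l + 1)$ for $l \geq 1$.

```python
def effective_potential(V, l, r, hbar=1.0, m=1.0):
    """V_eff(r) = V(r) + hbar^2 l(l+1) / (2 m r^2), as in (18.6)."""
    return V(r) + hbar**2 * l * (l + 1) / (2.0 * m * r**2)

def coulomb(r):
    """Attractive Coulomb potential -Q^2/r in units with Q = 1 (an assumption)."""
    return -1.0 / r

l = 1
# Setting d/dr [-1/r + l(l+1)/(2 r^2)] = 1/r^2 - l(l+1)/r^3 to zero gives r = l(l+1).
r_min = float(l * (l + 1))
v_min = effective_potential(coulomb, l, r_min)
print(r_min, v_min)
```

For $l = 0$ there is no centrifugal barrier and the effective potential is just the Coulomb potential itself.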

Now, as the discussion following Proposition 9.34 illustrates, even if $\psi$
is square-integrable over $\mathbb{R}^3 \setminus \{0\}$ and $\Delta\psi$ is square-integrable over
$\mathbb{R}^3 \setminus \{0\}$, $\psi$ may not be in the domain of the Laplacian, since the
distributional Laplacian of $\psi$ may contain a term that is supported at the origin.
In the case of the hydrogen atom, however, we will consider functions $\psi$ of
the form (18.1) where $f$ and $df/dr$ are bounded near the origin and have
exponential decay near infinity. Proposition 9.35 then tells us that $\psi$ is in
the domain of $\Delta$.

18.2 The Hydrogen Atom: Preliminaries


A hydrogen atom is formed out of a single electron that is “bound” to a 
proton by means of the electromagnetic attraction between the oppositely 
charged particles. The study of the hydrogen atom is a very important test 
case in quantum mechanics, and the ability of the Schrödinger equation to
explain the observed energy levels of hydrogen was a crucial early success 
of the theory. 

A proton is approximately 1,800 times as massive as an electron. Thus,
to first approximation, we may think of the location of the proton as being
fixed, with the electron “orbiting” around this location. A more careful
analysis considers both the proton and the electron as orbiting around
their center of mass. The Hamiltonian for the relative position of the two
particles is precisely that of a particle orbiting around a fixed center, except
that the mass of the electron is replaced by the reduced mass $\mu$ of the
electron-proton system. (See Exercise 1.) Here, as in Proposition 2.16 in
the classical case,

\[ \mu = \frac{m_e m_p}{m_e + m_p}, \]

where $m_e$ and $m_p$ are the masses of the electron and proton, respectively.
Since $m_p \gg m_e$, the reduced mass is nearly the same as the mass of the
electron.

After separating out the motion of the center of mass, we are left with
the following Hamiltonian for the relative position of the electron:

\[ \hat{H} = -\frac{\hbar^2}{2\mu} \Delta - \frac{Q^2}{|\mathbf{x}|}, \tag{18.7} \]

where $Q$ is the charge of the electron. (We use a system of units, such
as “electrostatic” or “Gaussian” units, in which the Coulomb constant is
equal to 1.) It follows from Theorem 9.38 that $\hat{H}$ is self-adjoint on $\mathrm{Dom}(\Delta)$
and that $\hat{H}$ is bounded below.

Note that the classical Hamiltonian $H(\mathbf{x}, \mathbf{p})$ for a hydrogen atom is not
bounded below. After all, we can simply take $\mathbf{p} = 0$ and take $\mathbf{x}$ very
close to the origin. This unboundedness would cause strange behavior for
a hypothetical classical hydrogen atom. After all, modeling a hydrogen
atom using the $1/r$ potential is only an approximation. We are using an
electrostatic formula for the force, the correct one when the positions of the
particles are held fixed, in a dynamical situation. A more realistic model
of hydrogen takes into account radiation, that is, the interaction of the
charged electron with the electromagnetic fields. Classically, a negatively
charged particle orbiting a positively charged nucleus would radiate, thus
giving up energy to the electromagnetic fields. The classical particle would
spiral rapidly toward the origin, with the particle’s energy going to $-\infty$ and
the energy of the electromagnetic field going to $+\infty$. Thus, if hydrogen were
made up of classical charged particles, the electron would go into a “death
spiral” and emit a giant burst of electromagnetic radiation.

Fortunately for us, this is not how real particles behave! In actuality, the 
electron is a quantum particle. A quantum electron “orbiting” a proton can 
still give up energy to the electromagnetic field. The Hamiltonian for the 
quantum hydrogen atom, however, is bounded below, as a consequence of 
Theorem 9.38. Thus, the electron can only drop to its ground state (the 
state of lowest energy), at which point it becomes stable. 


18.3 The Bound States of the Hydrogen Atom 


Our goal in this section is to find the eigenvectors for the Hamiltonian $\hat{H}$
in (18.7) with negative eigenvalues. Such eigenvectors constitute “bound
states,” that is, states in which the electron is bound to the proton. For
each negative number $E$, we look at the eigenspace $V_E$ for $\hat{H}$ with eigenvalue
$E$, that is, the space of all $\psi \in \mathrm{Dom}(\Delta)$ satisfying $\hat{H}\psi = E\psi$. Since $\hat{H}$ is
self-adjoint and, therefore, closed, this eigenspace will be a closed subspace
of $L^2(\mathbb{R}^3)$. Since, also, $\hat{H}$ commutes with rotations, $V_E$ will be invariant
under the usual action (Definition 17.1) of $\mathrm{SO}(3)$ on $L^2(\mathbb{R}^3)$. Thus, by
the discussion at the end of Sect. 17.7, $V_E$ decomposes as a direct sum of
finite-dimensional, irreducible $\mathrm{SO}(3)$-invariant subspaces.

We now look for such subspaces of $V_E$. In the following theorem, we
assume that the radial part of the wave function (the function $f$ in the
notation $V_{l,f}$ in Definition 17.18) has a certain very special form. After
analyzing this case, we argue that we have found in this way all of the
eigenvectors for $\hat{H}$ with negative eigenvalues.


Theorem 18.3 For each positive integer $n$, let

\[ E_n = -\frac{\mu Q^4}{2\hbar^2} \frac{1}{n^2}, \tag{18.8} \]

where $Q$ is the charge of the electron and $\mu$ is the reduced mass of the
electron-proton system, and let

\[ \rho_n(\mathbf{x}) = \frac{\sqrt{8\mu |E_n|}}{\hbar}\, |\mathbf{x}|. \]

Then for each $l = 0, 1, \ldots, n - 1$, there exists a polynomial $L_{n,l}$ such that
for each homogeneous harmonic polynomial $q$ of degree $l$, the function

\[ \psi(\mathbf{x}) = q(\mathbf{x})\, e^{-\rho_n(\mathbf{x})/2}\, L_{n,l}(\rho_n(\mathbf{x})) \tag{18.9} \]

satisfies

\[ \hat{H}\psi = E_n \psi. \]

It follows from Proposition 9.35 that the functions $\psi$ in (18.9) belong to
$\mathrm{Dom}(\Delta)$ and thus, by Theorem 9.38, to $\mathrm{Dom}(\hat{H})$. The polynomials $L_{n,l}$ are
the Laguerre polynomials. The coefficient of $-1/n^2$ in the formula (18.8)
for $E_n$ is the Rydberg constant (compare Sect. 1.2.1).
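
To attach numbers to (18.8), the sketch below evaluates $E_n$ in Gaussian units and converts to electron volts. The physical constants are approximate CODATA values entered by hand in CGS-Gaussian units, and the function names are hypothetical; treat the output as a rough check (a ground-state energy near $-13.6$ eV), not as a precision calculation.

```python
# Approximate physical constants in CGS-Gaussian units (entered by hand).
HBAR = 1.054571817e-27       # reduced Planck constant, erg * s
Q = 4.80320471e-10           # elementary charge, statcoulomb
M_E = 9.1093837e-28          # electron mass, g
M_P = 1.67262192e-24         # proton mass, g
ERG_PER_EV = 1.602176634e-12

def reduced_mass(m1, m2):
    """Reduced mass m1 m2 / (m1 + m2) of a two-body system."""
    return m1 * m2 / (m1 + m2)

def energy_level(n, mu):
    """E_n = -mu Q^4 / (2 hbar^2 n^2), as in (18.8), in erg."""
    return -mu * Q**4 / (2.0 * HBAR**2 * n**2)

mu = reduced_mass(M_E, M_P)
levels_ev = [energy_level(n, mu) / ERG_PER_EV for n in (1, 2, 3)]
print(mu / M_E)    # slightly less than 1, since m_p is much larger than m_e
print(levels_ev)   # negative energies scaling like 1/n^2
```

The ratio `mu / M_E` shows how small the reduced-mass correction is compared with treating the proton as fixed.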

Let us see how to connect Theorem 18.3 to the usual expression for
the hydrogen atom eigenvectors in the physics literature. In the first place,
physicists choose a certain basis $q_{l,m}$ for the space of harmonic polynomials,
which is, up to normalization constants, the basis in Theorem 17.4. In the
second place, physicists write the solutions in spherical coordinates. When
changing to spherical coordinates, we should keep in mind that $q_{l,m}$ is
homogeneous of degree $l$ and that $\rho_n(\mathbf{x})$ is just a constant multiple of the
distance from the origin. We obtain, then, the following expression:

\[ \psi_{n,l,m}(r, \theta, \phi) = Y_{l,m}(\theta, \phi)\, \rho_n^l\, e^{-\rho_n/2}\, L_{n,l}(\rho_n), \tag{18.10} \]

where $Y_{l,m}(\theta, \phi)$ is the restriction to the unit sphere of $q_{l,m}$.

Proof. If $E$ is a negative real number, we look for solutions to $\hat{H}\psi = E\psi$
of the form $q(\mathbf{x}) f(|\mathbf{x}|)$, where $q \in V_l$. Provided that $f(r)$ and $f'(r)$ are
bounded near the origin, Proposition 9.35 allows us to compute $\Delta\psi$ on
$\mathbb{R}^3 \setminus \{0\}$ without worrying about whether $\psi$ is differentiable at the origin.
Using Proposition 18.1, the equation for $f$ is

\[ -\frac{\hbar^2}{2\mu} \left[ \frac{d^2 f}{dr^2} + \frac{2(l + 1)}{r} \frac{df}{dr} \right]
   - \frac{Q^2}{r} f(r) = E f(r). \tag{18.11} \]

For large $r$, the two terms that involve a factor of $1/r$ become negligible,
and so

\[ -\frac{\hbar^2}{2\mu} \frac{d^2 f}{dr^2} \approx E f(r). \tag{18.12} \]

Recalling that $E$ is negative, (18.12) tells us that near infinity, $f$ should
behave like a combination of a growing and a decaying exponential. Since
we want square-integrable solutions, we require that only the exponentially
decaying term be present.

We therefore postulate a solution of the form

\[ f(r) = \exp\left( -\frac{\sqrt{2\mu |E|}}{\hbar}\, r \right) g(r), \tag{18.13} \]

for some function $g$. If we plug (18.13) into (18.11), there are canceling
terms equal to $E g(r)$ on each side, leaving

\[ -\frac{\hbar^2}{2\mu} \left[ \frac{d^2 g}{dr^2} - 2\frac{\sqrt{2\mu |E|}}{\hbar} \frac{dg}{dr}
   + \frac{2(l + 1)}{r} \frac{dg}{dr} - \frac{2(l + 1)}{r} \frac{\sqrt{2\mu |E|}}{\hbar}\, g(r) \right]
   - \frac{Q^2}{r}\, g(r) = 0. \]

We now introduce the new variable $\rho = (\sqrt{8\mu |E|}/\hbar)\, r$. After making this
change of variable, we find that each term in square brackets obtains a
factor of $8\mu |E|/\hbar^2$, so that our equation becomes

\[ -\frac{\hbar^2}{2\mu} \frac{8\mu |E|}{\hbar^2} \left[ \frac{d^2 g}{d\rho^2} - \frac{dg}{d\rho}
   + \frac{2(l + 1)}{\rho} \frac{dg}{d\rho} - \frac{(l + 1)}{\rho}\, g(\rho) \right]
   - \frac{2\sqrt{2\mu |E|}\, Q^2}{\hbar} \frac{1}{\rho}\, g(\rho) = 0. \]

Multiplying through by $\rho$ and simplifying yields the equation

\[ \rho \frac{d^2 g}{d\rho^2} - \rho \frac{dg}{d\rho} + 2(l + 1) \frac{dg}{d\rho}
   + \left[ \frac{Q^2 \sqrt{\mu}}{\hbar \sqrt{2|E|}} - (l + 1) \right] g(\rho) = 0. \tag{18.14} \]

If we postulate for $g$ a power series $\sum_{k=0}^{\infty} a_k \rho^k$, we obtain the following
recurrence relation for the coefficients:

\[ a_{k+1} = a_k\, \frac{k + l + 1 - \lambda}{(k + 1)\,[k + 2(l + 1)]}, \tag{18.15} \]

where

\[ \lambda = \frac{Q^2 \sqrt{\mu}}{\hbar \sqrt{2|E|}}. \]

The series for $g$ will terminate, yielding a polynomial solution to (18.14),
provided that $\lambda$ is an integer $n$ with $n \geq l + 1$. We can then solve for the
energy in terms of $n$ as follows:

\[ |E| = \frac{\mu Q^4}{2 n^2 \hbar^2}. \]

Recalling that $E$ is negative, we have obtained the desired form for the
energy levels. Furthermore, the condition $n \geq l + 1$ is the same as $l \leq n - 1$.
Finally, if we plug in the formula for $\rho$ in terms of $r$ and the formula for $f$
in terms of $g$, we obtain the form of the solution stated in the theorem. ■
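
The terminating case of the recurrence can be explored with exact rational arithmetic. The sketch below is an illustration under stated assumptions: it takes $\lambda = n$, normalizes $a_0 = 1$, and uses the recurrence in the form $a_{k+1} = a_k\,(k + l + 1 - n)/((k + 1)(k + 2(l + 1)))$, which is what one gets by matching the coefficient of $\rho^k$ in (18.14). It then substitutes the resulting polynomial back into (18.14) to confirm that every coefficient of the residual vanishes. The function names are hypothetical.

```python
from fractions import Fraction

def radial_polynomial(n, l):
    """Coefficients a_0, ..., a_{n-l-1} of g(rho) for lambda = n and a_0 = 1."""
    if l < 0 or n < l + 1:
        raise ValueError("need 0 <= l <= n - 1")
    coeffs = [Fraction(1)]
    for k in range(n - l - 1):
        ratio = Fraction(k + l + 1 - n, (k + 1) * (k + 2 * (l + 1)))
        coeffs.append(coeffs[-1] * ratio)
    return coeffs

def ode_residual(coeffs, n, l):
    """Coefficients of rho g'' - rho g' + 2(l+1) g' + (n - l - 1) g."""
    residual = [Fraction(0)] * len(coeffs)
    for k, a in enumerate(coeffs):
        if k >= 1:
            # rho g'' and 2(l+1) g' both lower the power of rho by one.
            residual[k - 1] += (k * (k - 1) + 2 * (l + 1) * k) * a
        # -rho g' and (n - l - 1) g keep the power of rho unchanged.
        residual[k] += (n - l - 1 - k) * a
    return residual

g_2s = radial_polynomial(2, 0)   # coefficients of 1 - rho/2
g_3s = radial_polynomial(3, 0)   # coefficients of 1 - rho + rho^2/6
print(g_2s, g_3s)
print(ode_residual(radial_polynomial(4, 1), 4, 1))
```

Up to normalization, these polynomials play the role of $L_{n,l}$ in Theorem 18.3; normalization conventions for Laguerre polynomials vary between references.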

It is important to emphasize that the functions in Theorem 18.3 do not
span the entire Hilbert space $L^2(\mathbb{R}^3)$. After all, these functions are all
eigenvectors for $\hat{H}$ with negative eigenvalues. If these vectors spanned $L^2(\mathbb{R}^3)$,
then the expectation value of the energy would always be negative. But it
is easy to produce functions $\psi$ in the domain of $\hat{H}$ for which $\langle \psi, \hat{H}\psi \rangle > 0$.
Simply take $\psi$ to be a Gaussian wave packet with mean position far from
the origin and with very large mean momentum. Then $\langle \psi, V\psi \rangle$ will be
close to zero but $\langle \psi, P^2\psi \rangle$ will be large and positive. Nevertheless, it can
be shown that the functions in Theorem 18.3 span the negative-energy
subspace of $L^2(\mathbb{R}^3)$. It is possible to analyze also the positive part of the
spectrum of $\hat{H}$, but the spectrum above zero is purely continuous and
represents a hydrogen atom that has ionized, that is, in which the electron has
escaped from the proton.

Theorem 18.4 As $n$ varies over all positive integers, $l$ varies from 0 to
$n - 1$, and $q$ varies over all homogeneous harmonic polynomials of degree
$l$, the eigenvectors in Theorem 18.3 span the negative-energy subspace of
$L^2(\mathbb{R}^3)$, that is, the range of the projection $\mu^{\hat{H}}((-\infty, 0))$, where $\mu^{\hat{H}}$ is the
projection-valued measure associated to $\hat{H}$ by the spectral theorem.

Proof. The proof requires results from spectral theory that go beyond the
machinery that we have developed in Chaps. 9 and 10, and which we cannot
reproduce in full here. Specifically, we make use of Theorem V.5.7 of [27],
which tells us that the negative-energy portion of the spectrum of $\hat{H}$ is
discrete, consisting of eigenvalues of finite multiplicity accumulating only
at zero.

We indicate briefly why the above result holds. If $A$ and $B$ are unbounded
self-adjoint operators, let us say that $B$ is a relatively compact perturbation
of $A$ if $B(A - \lambda I)^{-1}$ is a compact operator for every $\lambda$ in the resolvent set
of $A$. According to Lemma V.5.8 of [27], the potential energy operator
for the hydrogen atom is a relatively compact perturbation of the kinetic
energy operator. This is a strengthening of what we showed in the proof
of Theorem 9.38, namely that the potential energy operator is relatively
bounded with respect to the kinetic energy operator, with relative bound
less than 1. The proof of relative compactness relies on the fact that the
potential for the hydrogen atom goes to zero at infinity.

Meanwhile, let us say that $\lambda$ belongs to the essential spectrum of an
unbounded self-adjoint operator $A$ if either $\lambda$ is a nonisolated point in $\sigma(A)$
or $\lambda$ is an eigenvalue for $A$ with infinite multiplicity. According to
Theorem IV.5.35 of [27], a relatively compact perturbation of a self-adjoint
operator does not change the essential spectrum. Thus, the essential spectrum
of $\hat{H}$ is equal to the essential spectrum of the kinetic energy operator,
which is certainly contained in $[0, \infty)$, since the kinetic energy operator is
non-negative. It follows that any point in the negative-energy part of the
spectrum of $\hat{H}$ must be an isolated point in $\sigma(\hat{H})$ and an eigenvalue of
finite multiplicity.

In light of the preceding result, there is no continuous spectrum for $\hat{H}$
below zero, and we need only look for square-integrable eigenvectors. Since,
also, each eigenspace for $\hat{H}$ with eigenvalue $E < 0$ is finite dimensional, it
will decompose as a direct sum of irreducible, $\mathrm{SO}(3)$-invariant subspaces.
Such subspaces, according to Proposition 17.19, are always of the form $V_{l,f}$
for some $l$ and $f$, where $V_{l,f}$ is as in Definition 17.18. Thus, we look for
functions $\psi$ of the form $\psi(\mathbf{x}) = p(\mathbf{x}) f(|\mathbf{x}|)$ such that $\hat{H}\psi = E\psi$ for some
$E < 0$.

Now, if a function of the form $p(\mathbf{x}) f(|\mathbf{x}|)$ is to be an eigenfunction of
the Hamiltonian, $f$ must satisfy the differential equation (18.11). By
elementary results from the theory of linear ordinary differential equations,
this equation has precisely two linearly independent solutions, for any value
of $E$. Both solutions can be constructed by postulating a solution of the
form (18.13), introducing the new variable $\rho$, and then using a power series
expansion for $g(\rho)$ (Exercise 9). One of the solutions for $g(\rho)$ will have a
power series starting with $\rho^{-(2l+1)}$, in which case $\psi(\mathbf{x})$ will blow up like
$1/|\mathbf{x}|^{l+1}$ near the origin; such a function is not in the domain of the
Hamiltonian (Exercise 14 in Chap. 9). The other solution for $g(\rho)$ will start with
$\rho^0$ and may be obtained by using the form (18.13), changing from the
variable $r$ to the variable $\rho$, and then using the recurrence relation (18.15) to
define the coefficients of a power series. If the resulting series does not
terminate, it is not hard to see that the terms will behave for large $k$ like the
series for $e^{\rho}$. Since the function $f$ is equal to $e^{-\rho/2} g(\rho)$, this function will
grow like $e^{\rho/2}$ near infinity, which means that $\psi$ will not be in $L^2(\mathbb{R}^3)$. Thus,
to get a square-integrable solution, the series for $g(\rho)$ must terminate, in
which case $\psi$ is one of the functions in Theorem 18.3. ■

Corollary 18.5 Each eigenvalue $E_n$, as given in Theorem 18.3, has
multiplicity $n^2$.

Proof. According to Theorem 18.4, the eigenvectors in Theorem 18.3
constitute all of the eigenvectors for $\hat{H}$ with eigenvalue $E_n$. The number of
independent eigenvectors with eigenvalue $E_n$ is thus the sum of the dimensions
of the spaces $V_l$ of spherical harmonics, with $l = 0, 1, \ldots, n - 1$. This
number is, by Theorem 17.12,

\[ \sum_{l=0}^{n-1} (2l + 1) = n^2, \]

as claimed. ■
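
The count in Corollary 18.5 can be mirrored by listing labels for a basis of each eigenspace. In the sketch below, the pairs $(l, m)$ with $-l \leq m \leq l$ are an assumed labeling of the $2l + 1$ basis elements of $V_l$ (as in the basis $q_{l,m}$ mentioned earlier), and the function name is hypothetical.

```python
def bound_state_labels(n):
    """Labels (l, m) of a basis for the eigenspace with eigenvalue E_n."""
    return [(l, m) for l in range(n) for m in range(-l, l + 1)]

for n in range(1, 6):
    # Each line prints n, the number of labels, and n squared.
    print(n, len(bound_state_labels(n)), n * n)
```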


18.4 The Runge-Lenz Vector in the Quantum Kepler Problem

In Sect. 2.6, we showed that the classical Kepler problem can be solved
almost completely by making use of the Runge-Lenz vector, which is a
conserved quantity. The quantum version of the Runge-Lenz vector commutes
with the Hamiltonian and can elucidate a number of special properties of
the quantum Kepler problem, which we typically think of as describing a
hydrogen atom. In particular, the Runge-Lenz vector will help to explain
(1) the simple form $-R/n^2$ of the negative energies of the hydrogen atom
and (2) the apparent coincidence by which the energy of the states in (18.9)
is independent of $l$ for a given $n$. Note that the rotational symmetry of
the problem explains why the energy of the states in (18.9) is independent
of the choice of the harmonic polynomial $q$. Nevertheless, rotational
symmetry cannot explain why states for different values of $l$ (and thus
different radial dependence in the wave function) have the same energy. This
apparent coincidence will be explained by an additional symmetry of the
problem that is expressible in terms of the Runge-Lenz vector. See also
Sect. 7 of [17] for a somewhat different (but related) explanation for the
structure of the eigenvalues of the hydrogen atom and their multiplicities.

There are several computations involving the Runge-Lenz vector that,
while elementary, are laborious. Those computations are deferred to
Sect. 18.6.


18.4.1 Some Notation

To keep the notation as simple as possible, we will adopt in this section
Einstein’s summation convention, which states that repeated indices are
always summed on, even if there is no summation sign written. In this
section, the sum will always range from 1 to 3. Using this convention, we
write, say, the dot product of two vectors $\mathbf{u}, \mathbf{v}$ in $\mathbb{R}^3$ as $\mathbf{u} \cdot \mathbf{v} = u_j v_j$, where
the summation convention frees us from having to write out explicitly the
sum over $j$.

We will make frequent use of the totally antisymmetric symbol $\varepsilon_{jkl}$, where
$j$, $k$, and $l$ range from 1 to 3, defined as follows.

Definition 18.6 For $j, k, l \in \{1, 2, 3\}$, define $\varepsilon_{jkl}$ by the formula

\[ \varepsilon_{jkl} = \begin{cases}
   1 & \text{if } (j, k, l) \text{ is an even permutation of } (1, 2, 3) \\
   -1 & \text{if } (j, k, l) \text{ is an odd permutation of } (1, 2, 3) \\
   0 & \text{if any two of } j, k, l \text{ are equal.}
   \end{cases} \]

Thus, for example, $\varepsilon_{321} = -1$ and $\varepsilon_{212} = 0$. The commutation relations
for the basis $\{F_1, F_2, F_3\}$ for $\mathrm{so}(3)$ may be written (using the summation
convention!) as

\[ [F_j, F_k] = \varepsilon_{jkl} F_l. \tag{18.16} \]

For instance, if we take $j = 1$ and $k = 2$ in (18.16), then the sum on $l$ gives
a nonzero value only when $l = 3$, and we recover the relation $[F_1, F_2] = F_3$.
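
The symbol $\varepsilon_{jkl}$ and the relation (18.16) are easy to experiment with. The sketch below is an illustration under stated assumptions: it uses the closed form $(j - k)(k - l)(l - j)/2$ for the symbol, and it takes the matrices with entries $(F_j)_{ab} = -\varepsilon_{jab}$ as a concrete realization of the basis of $\mathrm{so}(3)$ (a common choice, assumed here to match the basis of Sect. 16.5). The function names are hypothetical.

```python
INDICES = (1, 2, 3)

def levi_civita(j, k, l):
    """Totally antisymmetric symbol for indices taken from {1, 2, 3}."""
    return (j - k) * (k - l) * (l - j) // 2

def F(j):
    """3x3 matrix with entries (F_j)_{ab} = -epsilon_{jab} (assumed basis of so(3))."""
    return [[-levi_civita(j, a, b) for b in INDICES] for a in INDICES]

def matmul(A, B):
    return [[sum(A[a][c] * B[c][b] for c in range(3)) for b in range(3)]
            for a in range(3)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[a][b] - BA[a][b] for b in range(3)] for a in range(3)]

def relation_18_16_holds(j, k):
    """Check [F_j, F_k] = epsilon_{jkl} F_l, with the sum over l written out."""
    lhs = commutator(F(j), F(k))
    rhs = [[sum(levi_civita(j, k, l) * F(l)[a][b] for l in INDICES)
            for b in range(3)] for a in range(3)]
    return lhs == rhs

print(levi_civita(3, 2, 1), levi_civita(2, 1, 2))
print(all(relation_18_16_holds(j, k) for j in INDICES for k in INDICES))
```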


18.4.2 The Classical Runge-Lenz Vector, Revisited

We have already introduced, in Sect. 2.6, the Runge-Lenz vector $\mathbf{A}$ in the
classical mechanics of a particle moving in a $1/r$ potential. We require a few
more properties of $\mathbf{A}$ before turning to the quantum version. We consider
a classical particle in $\mathbb{R}^3$ with Hamiltonian given by

\[ H(\mathbf{x}, \mathbf{p}) = \frac{|\mathbf{p}|^2}{2\mu} - \frac{Q^2}{|\mathbf{x}|}. \tag{18.17} \]

This is just the Hamiltonian for the classical Kepler problem, except that
we replace the mass $m$ of the planet by the reduced mass $\mu$ of the
electron-proton system, and we replace the constant $k := mMG$ by $Q^2$.

For the Hamiltonian in (18.17), the Runge-Lenz vector is given by the
formula

\[ \mathbf{A}(\mathbf{x}, \mathbf{p}) = \frac{1}{\mu Q^2}\, \mathbf{p} \times \mathbf{J} - \frac{\mathbf{x}}{|\mathbf{x}|}, \]

where $\mathbf{J} := \mathbf{x} \times \mathbf{p}$ is the angular momentum. By Proposition 2.34, the
Runge-Lenz vector is a conserved quantity for the classical Kepler problem,
in addition to $H$ and $\mathbf{J}$, which are conserved quantities for any radial
potential. By results of Sect. 2.6, we have the following relations among
these conserved quantities:

\[ \mathbf{A} \cdot \mathbf{J} = 0 \]

\[ |\mathbf{A}|^2 = 1 + \frac{2H}{\mu Q^4}\, |\mathbf{J}|^2. \]
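
The two classical relations, $\mathbf{A} \cdot \mathbf{J} = 0$ and $|\mathbf{A}|^2 = 1 + (2H/(\mu Q^4))|\mathbf{J}|^2$, are algebraic identities in $(\mathbf{x}, \mathbf{p})$, so they can be spot-checked at any phase-space point. The sketch below does this in floating point; the values of the reduced mass and the charge are arbitrary illustrative numbers, and the function names are hypothetical.

```python
import math

MU = 1.3  # illustrative reduced mass, arbitrary units (an assumption)
Q = 0.8   # illustrative charge, arbitrary units (an assumption)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def hamiltonian(x, p):
    """H(x, p) = |p|^2 / (2 mu) - Q^2 / |x|, as in (18.17)."""
    return dot(p, p) / (2.0 * MU) - Q**2 / math.sqrt(dot(x, x))

def runge_lenz(x, p):
    """A(x, p) = (p x J) / (mu Q^2) - x / |x| with J = x x p."""
    J = cross(x, p)
    p_cross_J = cross(p, J)
    r = math.sqrt(dot(x, x))
    return tuple(p_cross_J[i] / (MU * Q**2) - x[i] / r for i in range(3))

x = (0.3, -1.2, 0.5)   # arbitrary position
p = (0.7, 0.1, -0.4)   # arbitrary momentum
A = runge_lenz(x, p)
J = cross(x, p)
H = hamiltonian(x, p)

orthogonality = dot(A, J)
length_gap = dot(A, A) - (1.0 + 2.0 * H * dot(J, J) / (MU * Q**4))
print(orthogonality, length_gap)  # both should vanish up to rounding error
```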




Lemma 18.7 The Runge-Lenz vector $\mathbf{A}$ and the Hamiltonian $H$ in (18.17)
satisfy the following Poisson bracket relations:

\[ \{A_j, H\} = 0 \]

\[ \{A_j, A_m\} = -\frac{2}{\mu Q^4}\, \varepsilon_{jml}\, J_l H. \tag{18.18} \]

We have already shown that the Runge-Lenz vector is a conserved quantity
(Proposition 2.34), which is equivalent (Proposition 2.25) to saying that
the Poisson bracket of $A_j$ with $H$ is zero, as claimed. The proof of (18.18)
is deferred to Sect. 18.6. We now introduce certain combinations of the
Runge-Lenz vector, the angular momentum, and the Hamiltonian that
form a Lie algebra under the Poisson bracket. In the construction of these
functions, we need to take a square root of the Hamiltonian, which
necessitates separating the positive-energy and negative-energy parts of the phase
space. Our interest is primarily in the negative-energy case.

Definition 18.8 Let $U^-$ denote the negative-energy part of the classical
phase space,

\[ U^- = \{ (\mathbf{x}, \mathbf{p}) \in \mathbb{R}^6 \mid H(\mathbf{x}, \mathbf{p}) < 0 \}. \]

Consider on $U^-$ the normalized Runge-Lenz vector $\mathbf{B}$ given by

\[ \mathbf{B} = \left( \frac{\mu Q^4}{2|H|} \right)^{1/2} \mathbf{A}. \]

Define also vector-valued functions $\mathbf{I}$ and $\mathbf{K}$ on $U^-$ by

\[ \mathbf{I} = \frac{\mathbf{J} + \mathbf{B}}{2}, \qquad \mathbf{K} = \frac{\mathbf{J} - \mathbf{B}}{2}. \]

Theorem 18.9 The functions $\mathbf{I}$ and $\mathbf{K}$ Poisson-commute with the
Hamiltonian and satisfy the following Poisson-bracket relations on the
negative-energy set $U^-$:

\[ \{I_j, I_k\} = \varepsilon_{jkl} I_l \]

\[ \{K_j, K_k\} = \varepsilon_{jkl} K_l \]

\[ \{I_j, K_k\} = 0. \]

The functions $\mathbf{I}$ and $\mathbf{K}$ also satisfy the following algebraic relations:

\[ |\mathbf{I}|^2 = |\mathbf{K}|^2 = \frac{\mu Q^4}{8|H|}. \]

In Theorem 18.9, we use the summation convention introduced in the
previous subsection. The proof of this theorem is elementary but rather
laborious, and is deferred to Sect. 18.6.

The span of the functions $I_1, I_2, I_3$ and $K_1, K_2, K_3$ on $U^-$, which is
the same as the span of the functions $B_1, B_2, B_3$ and $J_1, J_2, J_3$, forms a
6-dimensional Lie algebra under the Poisson bracket. Comparing the
Poisson-bracket relations among the $I$’s and among the $K$’s to the relations among
the basis elements $F_1, F_2, F_3$ for $\mathrm{so}(3)$, we see that the span of the $I$’s and
the span of the $K$’s are both isomorphic to $\mathrm{so}(3)$ [or, if you prefer, to $\mathrm{su}(2)$].
Since also each $I_j$ commutes with each $K_k$, the 6-dimensional Lie algebra
spanned by the $I$’s and the $K$’s is isomorphic to $\mathrm{so}(3) \oplus \mathrm{so}(3)$. Meanwhile,
as demonstrated in Exercise 4, $\mathrm{so}(3) \oplus \mathrm{so}(3)$ is isomorphic to the Lie algebra
$\mathrm{so}(4)$. Since all the $I$’s and $K$’s Poisson-commute with the Hamiltonian, we
say that the Kepler problem has $\mathrm{so}(4)$ symmetry. This is in contrast to the
dynamics of a particle moving in $\mathbb{R}^3$ in the force generated by a typical
radial potential, which has only $\mathrm{so}(3)$ symmetry.
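
The passage from the brackets of $\mathbf{J}$ and $\mathbf{B}$ to those of $\mathbf{I}$ and $\mathbf{K}$ is pure linear algebra, which the sketch below carries out with exact fractions. It is an illustration under stated assumptions: elements of the 6-dimensional span are stored as coefficient lists in the basis $(J_1, J_2, J_3, B_1, B_2, B_3)$, and the bracket is taken to satisfy $\{J_j, J_k\} = \varepsilon_{jkl} J_l$, $\{J_j, B_k\} = \varepsilon_{jkl} B_l$ (that is, $\mathbf{B}$ behaves as a vector under rotations), and $\{B_j, B_k\} = \varepsilon_{jkl} J_l$ (the form that (18.18) takes on $U^-$ after normalization). The function names are hypothetical.

```python
from fractions import Fraction

def eps(j, k, l):
    """Totally antisymmetric symbol; works for indices 0, 1, 2 as well."""
    return (j - k) * (k - l) * (l - j) // 2

def bracket(u, v):
    """Bracket of u and v, given as coefficients of (J1, J2, J3, B1, B2, B3)."""
    out = [Fraction(0)] * 6
    for j in range(3):
        for k in range(3):
            for l in range(3):
                e = eps(j, k, l)
                if e == 0:
                    continue
                # {J_j, J_k} and {B_j, B_k} both contribute to J_l.
                out[l] += e * (u[j] * v[k] + u[3 + j] * v[3 + k])
                # {J_j, B_k} and {B_j, J_k} both contribute to B_l.
                out[3 + l] += e * (u[j] * v[3 + k] + u[3 + j] * v[k])
    return out

def I(j):
    """I_j = (J_j + B_j) / 2 as a coefficient list."""
    vec = [Fraction(0)] * 6
    vec[j] = Fraction(1, 2)
    vec[3 + j] = Fraction(1, 2)
    return vec

def K(j):
    """K_j = (J_j - B_j) / 2 as a coefficient list."""
    vec = [Fraction(0)] * 6
    vec[j] = Fraction(1, 2)
    vec[3 + j] = Fraction(-1, 2)
    return vec

print(bracket(I(0), I(1)) == I(2))                    # {I_1, I_2} = I_3
print(bracket(K(0), K(1)) == K(2))                    # {K_1, K_2} = K_3
print(all(c == 0 for c in bracket(I(0), K(1))))       # {I_1, K_2} = 0
```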

To be more precise, “$\mathrm{so}(4)$ symmetry” prevails only on the negative-energy
subset $U^-$ of the classical phase space. On the positive-energy subset
$U^+$, the span of the functions $B_1, B_2, B_3$ and $J_1, J_2, J_3$ again forms a
6-dimensional Lie algebra. This Lie algebra, however, is not isomorphic to
$\mathrm{so}(4)$, but rather to $\mathrm{so}(3, 1)$, where $\mathrm{so}(3, 1)$ is the Lie algebra of the group of
$4 \times 4$ matrices that preserve the quadratic form $x_1^2 + x_2^2 + x_3^2 - x_4^2$. The reason
the formulas on $U^+$ are different from those on $U^-$ is that calculation of
the relevant Poisson brackets involves the function $H/|H|$, which has the
value 1 on $U^+$ and the value $-1$ on $U^-$. (The factor of $H$ comes from
Lemma 18.7 and the factor of $|H|$ from the factor of $\sqrt{|H|}$ in the definition
of $\mathbf{B}$.)


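18.4.3 The Quantum Runge-Lenz Vector

We now introduce the quantum counterpart $\hat{\mathbf{A}}$ of the classical Runge-Lenz
vector $\mathbf{A}$. The quantum Runge-Lenz vector satisfies most of the same properties
as the classical version, with a few small but crucial “quantum corrections.”

Definition 18.10 Define the quantum Runge-Lenz vector by

\[ \hat{\mathbf{A}} = \frac{1}{\mu Q^2}\, \frac{1}{2} \left( \mathbf{P} \times \hat{\mathbf{J}} - \hat{\mathbf{J}} \times \mathbf{P} \right) - \frac{\mathbf{X}}{|\mathbf{X}|}. \]

Note that in the quantum case, $-\hat{\mathbf{J}} \times \mathbf{P}$ is not the same as $\mathbf{P} \times \hat{\mathbf{J}}$, because of
the noncommutativity of the factors. The particular combination of $\mathbf{P} \times \hat{\mathbf{J}}$
and $\hat{\mathbf{J}} \times \mathbf{P}$ in Definition 18.10 is used because it yields a self-adjoint
operator. The Runge-Lenz vector can also be computed as

\[ \hat{\mathbf{A}} = \frac{1}{\mu Q^2} \left( \mathbf{P} \times \hat{\mathbf{J}} - i\hbar \mathbf{P} \right) - \frac{\mathbf{X}}{|\mathbf{X}|}, \tag{18.19} \]

as will be verified in Sect. 18.6.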
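
One way to see why an alternative expression with only one cross product is available is to reorder the factors in the second cross product. The sketch below writes $\hat{\mathbf{J}}$ for the angular momentum operator and assumes, as for any vector operator (compare Proposition 17.27), that $[\hat{J}_k, P_l] = i\hbar\,\varepsilon_{klm} P_m$; it also uses the identity $\varepsilon_{jkl}\varepsilon_{klm} = 2\delta_{jm}$ and the summation convention. Domain questions are ignored.

```latex
\begin{align*}
(\hat{\mathbf{J}} \times \mathbf{P})_j
  &= \varepsilon_{jkl}\, \hat{J}_k P_l \\
  &= \varepsilon_{jkl}\, P_l \hat{J}_k + \varepsilon_{jkl}\, [\hat{J}_k, P_l] \\
  &= -(\mathbf{P} \times \hat{\mathbf{J}})_j + i\hbar\, \varepsilon_{jkl}\, \varepsilon_{klm}\, P_m \\
  &= -(\mathbf{P} \times \hat{\mathbf{J}})_j + 2 i \hbar\, P_j ,
\end{align*}
% and therefore
\[
  \tfrac{1}{2} \left( \mathbf{P} \times \hat{\mathbf{J}} - \hat{\mathbf{J}} \times \mathbf{P} \right)
  = \mathbf{P} \times \hat{\mathbf{J}} - i\hbar\, \mathbf{P} .
\]
```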

In the interests of keeping the exposition manageable, we will not concern 
ourselves in what follows with determining the precise domains on which 
various identities hold. 

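Proposition 18.11 The quantum Runge-Lenz vector $\hat{\mathbf{A}}$ satisfies the
following relations:

\[ \hat{\mathbf{A}} \cdot \hat{\mathbf{J}} = \hat{\mathbf{J}} \cdot \hat{\mathbf{A}} = 0 \]

\[ \hat{\mathbf{A}} \cdot \hat{\mathbf{A}} = 1 + \frac{2\hat{H}}{\mu Q^4} \left( \hat{\mathbf{J}} \cdot \hat{\mathbf{J}} + \hbar^2 \right). \tag{18.20} \]

Note that there is a “quantum correction” in (18.20); the factor of $\mathbf{J} \cdot \mathbf{J}$
in the classical expression for $\mathbf{A} \cdot \mathbf{A}$ is replaced by $\hat{\mathbf{J}} \cdot \hat{\mathbf{J}} + \hbar^2$. This correction
gives rise to a quantum correction in (18.22), which in turn is essential
to getting the correct value for the energy eigenvalues in Corollary 18.17.
The proof of this result and the other results of this section are deferred to
Sect. 18.6.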

Lemma 18.12 The quantum Runge-Lenz vector $\hat{\mathbf{A}}$ and the Hamiltonian
$\hat{H}$ satisfy the following commutation relations:

\[ \frac{1}{i\hbar} [\hat{A}_j, \hat{H}] = 0 \]

\[ \frac{1}{i\hbar} [\hat{A}_j, \hat{A}_m] = -\frac{2}{\mu Q^4}\, \varepsilon_{jml}\, \hat{J}_l \hat{H}. \tag{18.21} \]

Note that since $\hat{H}$ commutes with rotations, it commutes with the angular
momentum operators $\hat{J}_l$. Thus, in (18.21), we could just as well write
$\hat{H}\hat{J}_l$ in place of $\hat{J}_l\hat{H}$. As in the classical case, if we normalize the
components of the Runge-Lenz vector by dividing by the square root of the
Hamiltonian, then these operators together with the angular momentum
operators form a 6-dimensional Lie algebra.

Definition 18.13 Let $V^-$ denote the negative-energy subspace of $L^2(\mathbb{R}^3)$,
that is, the range of the spectral projection $\mu^{\hat{H}}((-\infty, 0))$. Let $|\hat{H}|$ denote
the restriction to $V^-$ of the operator $-\hat{H}$. On $V^-$, define operators $\hat{\mathbf{B}}$ by

\[ \hat{\mathbf{B}} = \left( \frac{\mu Q^4}{2|\hat{H}|} \right)^{1/2} \hat{\mathbf{A}}. \]

Define also operators $\hat{\mathbf{I}}$ and $\hat{\mathbf{K}}$, as in the classical case, by

\[ \hat{\mathbf{I}} = \frac{\hat{\mathbf{J}} + \hat{\mathbf{B}}}{2}, \qquad \hat{\mathbf{K}} = \frac{\hat{\mathbf{J}} - \hat{\mathbf{B}}}{2}. \]

It is possible to define the absolute value of any self-adjoint operator
by means of the functional calculus. However, since the restriction of $\hat{H}$
to $V^-$ is, by definition, negative definite, the restriction of $|\hat{H}|$ to $V^-$
coincides with the restriction to $V^-$ of $-\hat{H}$. The operator $1/\sqrt{|\hat{H}|}$ is the
operator whose restriction to the energy eigenspace with eigenvalue $E_n$
is $(1/\sqrt{|E_n|})\, I$. The components of $\hat{\mathbf{B}}$ are unbounded operators, defined
on suitable dense subspaces of the Hilbert space $V^-$.

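Theorem 18.14 The operators $\mathbf{I}$ and $\mathbf{K}$ commute with the Hamiltonian $H$ and satisfy the following commutation relations:

\[ \frac{1}{i\hbar}[I_j, I_k] = \varepsilon_{jkl}I_l \]

\[ \frac{1}{i\hbar}[K_j, K_k] = \varepsilon_{jkl}K_l \]

\[ \frac{1}{i\hbar}[I_j, K_k] = 0. \]

These operators also satisfy the following algebraic relations:

\[ \mathbf{I}\cdot\mathbf{I} = \mathbf{K}\cdot\mathbf{K} = \frac{\mu Q^4}{8|H|} - \frac{\hbar^2}{4}. \tag{18.22} \]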


18.4.4 Representations of so(4)

In light of the commutation relations in Theorem 18.14, we can define a representation $\pi$ of the Lie algebra so(4) $\cong$ so(3) $\oplus$ so(3) on the negative-energy subspace $V^-$ as follows:

\[ \pi(F_j, 0) = \frac{1}{i\hbar}I_j; \qquad \pi(0, F_j) = \frac{1}{i\hbar}K_j. \tag{18.23} \]

It is therefore desirable to classify the irreducible finite-dimensional representations of so(3) $\oplus$ so(3), which we do in the following proposition.










Proposition 18.15 Suppose $V_k$ and $V_l$ are irreducible representations of so(3) of dimensions $2k+1$ and $2l+1$, respectively. Then $V_k \otimes V_l$ is irreducible when viewed as a representation of so(3) $\oplus$ so(3) as in Remark 16.49. Furthermore, every irreducible finite-dimensional representation of so(3) $\oplus$ so(3) is isomorphic to $V_k \otimes V_l$ for a unique ordered pair $(k, l)$.

For any representation $V_k \otimes V_l$ of so(3) $\oplus$ so(3), define Casimir operators $C_1$ and $C_2$ by the formula

\[ C_1 = \sum_{j=1}^{3} \pi_k(F_j)^2 \otimes I; \qquad C_2 = \sum_{j=1}^{3} I \otimes \pi_l(F_j)^2, \]

where $\pi_k$ and $\pi_l$ denote the actions of so(3) on $V_k$ and $V_l$. Then we have

\[ C_1 = -k(k+1)I; \qquad C_2 = -l(l+1)I. \]


Proof. To classify the irreducible representations of so(3) $\oplus$ so(3), we could appeal to the general theory of representations of direct sums of Lie algebras. It is not hard, however, to give a direct proof using the same sort of reasoning we used in the classification of irreducible representations of so(3). We will omit the details of this computation. The result on the Casimir operators follows easily from Proposition 17.8. ■

In any finite-dimensional subspace of $V^-$ that is invariant and irreducible under the action of so(3) $\oplus$ so(3) in (18.23), the Casimir operators are given by $C_1 = -\mathbf{I}\cdot\mathbf{I}/\hbar^2$ and $C_2 = -\mathbf{K}\cdot\mathbf{K}/\hbar^2$. Since, by Theorem 18.14, $\mathbf{I}\cdot\mathbf{I} = \mathbf{K}\cdot\mathbf{K}$ on $V^-$, all of the irreducible representations of so(3) $\oplus$ so(3) that arise inside $V^-$ will be of the form $V_k \otimes V_k$.

Theorem 18.16 Let $W^{(n)}$ denote the eigenspace for the Hamiltonian with eigenvalue $E_n$. Then $W^{(n)}$ is invariant and irreducible under the action of so(3) $\oplus$ so(3) in (18.23). More specifically, we have the isomorphism

\[ W^{(n)} \cong V_k \otimes V_k, \]

as representations of so(3) $\oplus$ so(3), where $k = (n-1)/2$ and where $V_k$ is the irreducible representation of so(3) of dimension $2k + 1 = n$.

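Corollary 18.17 If $n$, $k$, and $W^{(n)}$ are as in Theorem 18.16, then for all $\psi \in W^{(n)}$, we have

\[ \mathbf{I}\cdot\mathbf{I}\,\psi = \mathbf{K}\cdot\mathbf{K}\,\psi = \hbar^2 k(k+1)\psi. \]

Using (18.22), the eigenvalue $E_n$ of $H$ on $W^{(n)}$ can be solved for as

\[ E_n = -\frac{\mu Q^4}{8\hbar^2\left(k + \frac{1}{2}\right)^2} = -\frac{\mu Q^4}{2\hbar^2 n^2}. \]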

The expression for $E_n$ in Corollary 18.17 is the same as in Theorem 18.3. The remarkable thing about the proof of Corollary 18.17 is that it is purely algebraic, relying only on the commutation relations among the operators $I_k$ and $K_l$, along with the relationship (18.22) between the Hamiltonian operator $H$ and the $I_k$'s and $K_l$'s.

Proof of Corollary 18.17. It is easily seen that the operators $\mathbf{I}\cdot\mathbf{I}$ and $\mathbf{K}\cdot\mathbf{K}$, when restricted to an irreducible subspace for the action of so(3) $\oplus$ so(3), are equal to $-\hbar^2C_1$ and $-\hbar^2C_2$, where $C_1$ and $C_2$ are the Casimir operators appearing in Proposition 18.15. Thus, if $W^{(n)}$ is isomorphic to $V_k \otimes V_k$, with $k = (n-1)/2$, then $\mathbf{I}\cdot\mathbf{I}$ and $\mathbf{K}\cdot\mathbf{K}$ will be equal to $\hbar^2k(k+1)I$, as claimed. On the other hand, $\mathbf{I}\cdot\mathbf{I}$ and $\mathbf{K}\cdot\mathbf{K}$ are related to the Hamiltonian $H$ by (18.22), from which we can solve for $E_n$. ■
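
The final step can be written out as follows. This is a sketch that takes (18.22) in the form $\mathbf{I}\cdot\mathbf{I} = \mu Q^4/(8|H|) - \hbar^2/4$ and uses that $|H|$ acts as $|E_n| = -E_n$ on the eigenspace with eigenvalue $E_n$.

```latex
\begin{align*}
\hbar^2 k(k+1) &= \frac{\mu Q^4}{8|E_n|} - \frac{\hbar^2}{4} \\
\frac{\mu Q^4}{8|E_n|} &= \hbar^2\left(k^2 + k + \tfrac{1}{4}\right)
                        = \hbar^2\left(k + \tfrac{1}{2}\right)^2 \\
|E_n| &= \frac{\mu Q^4}{8\hbar^2\left(k + \tfrac{1}{2}\right)^2}
       = \frac{\mu Q^4}{2\hbar^2 (2k+1)^2}
       = \frac{\mu Q^4}{2\hbar^2 n^2},
\qquad n = 2k + 1.
\end{align*}
```

Without the $-\hbar^2/4$ term, the middle line would read $\hbar^2k(k+1)$ rather than a perfect square, which is why the quantum correction is needed to obtain the $1/n^2$ dependence.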

Proof of Theorem 18.16. Since each component of $\mathbf{A}$ and $\mathbf{J}$ commutes with $H$, each component of $\mathbf{I}$ and $\mathbf{K}$ will also commute with $H$. Each eigenspace $W^{(n)}$ of $H$ is therefore invariant under the action of $\mathbf{I}$ and $\mathbf{K}$. Since the $I$'s and $K$'s are self-adjoint and $W^{(n)}$ is finite dimensional, $W^{(n)}$ will decompose as a direct sum of irreducible invariant subspaces. By Proposition 18.15, these irreducible subspaces will be of the form $V_k \otimes V_l$, where $V_k$ and $V_l$ are irreducible representations of so(3) of dimension $2k+1$ and $2l+1$, respectively. But now, the operators $\mathbf{I}\cdot\mathbf{I}$ and $\mathbf{K}\cdot\mathbf{K}$, when restricted to one of the irreducible subspaces of $W^{(n)}$, are equal to $-\hbar^2C_1$ and $-\hbar^2C_2$, where $C_1$ and $C_2$ are the Casimir operators appearing in Proposition 18.15. Since $\mathbf{I}\cdot\mathbf{I} = \mathbf{K}\cdot\mathbf{K}$ on all of $V^-$, the eigenvalues of $C_1$ and $C_2$ must be equal on each irreducible subspace of $W^{(n)}$. Thus, we must have $k = l$, meaning that only irreducible subspaces of the form $V_k \otimes V_k$ arise.

Now, under the isomorphism of some irreducible subspace of $W^{(n)}$ with $V_k \otimes V_k$, the operators $I_k$ and $K_k$ act as $i\hbar F_k \otimes I$ and $i\hbar I \otimes F_k$, respectively, where the $F_k$'s are the usual basis for so(3). Since $\mathbf{J} = \mathbf{I} + \mathbf{K}$, each $J_k$ acts as $i\hbar(F_k \otimes I + I \otimes F_k)$. This means that $V_k \otimes V_k$, under the action of the $J_k$'s, can be thought of as a tensor product of two representations of so(3), viewed as another representation of so(3) as in Definition 16.48. Viewed this way, $V_k \otimes V_k$ decomposes as in Proposition 17.23 as

\[ V_k \otimes V_k \cong V_0 \oplus V_1 \oplus \cdots \oplus V_{2k}. \tag{18.24} \]

On the other hand, we know from Theorem 18.3 that $W^{(n)}$ decomposes under the action of so(3) as

\[ V_0 \oplus V_1 \oplus \cdots \oplus V_{n-1}. \tag{18.25} \]

Thus, the space of the form $V_k \otimes V_k$ must be all of $W^{(n)}$; if there were another term then the trivial representation $V_0$ would occur more than once in $W^{(n)}$. This being the case, matching the decompositions (18.24) and (18.25) requires that $2k = n - 1$, as claimed in the theorem. ■
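
The dimension count behind this matching can be checked mechanically. The sketch below assumes the standard Clebsch-Gordan rule that $V_k \otimes V_l$ contains each $V_j$ with $j = |k-l|, |k-l|+1, \ldots, k+l$ exactly once; the function names are illustrative only.

```python
from fractions import Fraction

def dim(j):
    """Dimension 2j + 1 of the irreducible so(3) representation V_j."""
    return int(2 * j + 1)

def clebsch_gordan(k, l):
    """Spins j occurring in V_k (x) V_l: |k - l|, |k - l| + 1, ..., k + l."""
    k, l = Fraction(k), Fraction(l)
    j = abs(k - l)
    spins = []
    while j <= k + l:
        spins.append(j)
        j += 1
    return spins

for n in range(1, 6):
    k = Fraction(n - 1, 2)
    spins = clebsch_gordan(k, k)                 # V_0, V_1, ..., V_{n-1}
    total = sum(dim(j) for j in spins)
    print(n, [int(j) for j in spins], total)     # total equals n**2
```

For each $n$, the spins are $0, 1, \ldots, n-1$ and the dimensions add up to $n^2 = (2k+1)^2$, the dimension of $V_k \otimes V_k$.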

The proof of Theorem 18.16 relies to some extent on the results of Sect. 18.3. Using only algebraic manipulations involving the Runge-Lenz vector, however, we could still argue that the eigenvalues of $H$ must be of the form given in Corollary 18.17. We would not, however, know that for every positive integer $n$, the number $E_n$ is actually an eigenvalue for $H$. We would also not know that each eigenspace $W^{(n)}$ is irreducible under the action of so(4); conceivably, based only on the algebra, $W^{(n)}$ could have, say, dimension $2n^2$ instead of $n^2$.


18.5 The Role of Spin 

The spin of the electron is 1/2. As discussed in Sect. 17.8, this means that the Hilbert space for an electron is $L^2(\mathbb{R}^3) \otimes V_{1/2}$, where $V_{1/2}$ is a 2-dimensional vector space that carries an irreducible projective unitary representation of SO(3). Up to now, we have neglected the spin in our calculations. The reason for this omission is simple: to first approximation, the spin plays no role in the calculation. Specifically, in the simplest model of a hydrogen atom with spin, the Hamiltonian is simply $H \otimes I$, where $H$ is the operator in (18.7), acting on $L^2(\mathbb{R}^3)$. For any $n > 0$, we can obtain a basis of eigenvectors for $H \otimes I$ with eigenvalue $E_n$ by taking vectors of the form $\psi_{n,l,m} \otimes e_j$, where the $\psi_{n,l,m}$'s are as in (18.10) and where $\{e_1, e_2\}$ forms a basis for $V_{1/2}$.

Now, from the point of view of rotational symmetry, the basis $\psi_{n,l,m} \otimes e_j$ is not the most natural one. Rather, we should decompose the eigenspaces into irreducible invariant subspaces for the (projective) action of SO(3), where SO(3) acts on both $L^2(\mathbb{R}^3)$ and $V_{1/2}$. We have already decomposed the eigenspaces inside $L^2(\mathbb{R}^3)$ into irreducible invariant subspaces, namely the span of $\psi_{n,l,m}$ where $n$ and $l$ are fixed and $m$ varies. Thus, to obtain the irreducible invariant subspaces inside $L^2(\mathbb{R}^3) \otimes V_{1/2}$, we use the method of “addition of angular momentum” from Sect. 17.9. According to Proposition 17.22, $V_l \otimes V_{1/2}$ is irreducible if $l = 0$ and isomorphic to $V_{l+1/2} \oplus V_{l-1/2}$ if $l > 0$. Consider, for example, the case $n = 3$, $l = 1$, the so-called “3p states” in traditional chemistry terminology. Since $V_1 \otimes V_{1/2}$ decomposes as $V_{3/2} \oplus V_{1/2}$, when we take spin into account, we obtain a 4-dimensional space and a 2-dimensional space. We can obtain bases for these spaces by tracing through the proof of Proposition 17.22.
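
The bookkeeping for all $l < n$ can be tabulated in a few lines. The sketch below only applies the rule quoted above from Proposition 17.22; the helper names are illustrative and not from the text.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def spin_orbit_multiplets(n):
    """For each l < n, list the total angular momenta j in V_l (x) V_{1/2}."""
    table = {}
    for l in range(n):
        # l = 0 gives a single irreducible piece; l > 0 splits into two.
        table[l] = [l + HALF] if l == 0 else [l + HALF, l - HALF]
    return table

def dimension(j):
    """Dimension 2j + 1 of V_j."""
    return int(2 * j + 1)

levels = spin_orbit_multiplets(3)
# levels[1] is the "3p" case: j = 3/2 (dimension 4) and j = 1/2 (dimension 2).
three_p_dims = [dimension(j) for j in levels[1]]
total = sum(dimension(j) for js in levels.values() for j in js)
print(three_p_dims, total)   # the total is 2 * n**2 for n = 3
```

The total dimension $2n^2$ reflects the doubling of each of the $n^2$ spatial states by the two spin states.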

The decomposition described in the previous paragraph is essential when considering the “fine structure” of hydrogen. Our model of hydrogen using the Hamiltonian (18.7) is only a first approximation. More realistic models take into account various corrections, including radiative corrections, a finite size for the nucleus, and “spin-orbit coupling,” among other things. The notion of spin-orbit coupling adds a term into the Hamiltonian involving the operator $\mathbf{J}\cdot\boldsymbol{\sigma}$, where $\sigma_1$, $\sigma_2$, and $\sigma_3$ are the operators describing the action of so(3) on $V_{1/2}$. When this term is included, the Hamiltonian is no longer of the form $A \otimes I$ for some operator $A$ on $L^2(\mathbb{R}^3)$. Thus, we can no longer simply append the spin to the end of the computation, but must take it into account from the beginning.

The various corrections to the Hamiltonian for the hydrogen atom have the effect of reducing the multiplicities of the eigenvalues. Almost any correction we make, for example, will destroy the independence of the eigenvalue on $l$ for a given $n$, simply because the correction terms in the Hamiltonian will not commute with the quantum Runge-Lenz vector. Nevertheless, all of the corrections that make up the fine structure of hydrogen preserve the rotational symmetry of the problem. Thus, the same irreducible representations of SO(3) that we had in the simple model will appear after the corrections are made. For $n = 2$, $l = 1$, for example, we will still have a 4-dimensional space and a 2-dimensional space, but these two spaces will no longer have the same energy.

18.6 Runge-Lenz Calculations 

In this section, we fill in many of the computations that we passed over without proof in Sect. 18.4. Although all the calculations are, in principle, elementary, there are a number of nonobvious tricks that help simplify the algebra. We will make frequent use of the concepts of functions that transform like vectors (on the classical side) and of vector operators (on the quantum side), including Propositions 17.25 and 17.27 (Sect. 17.10). In particular, we note that the position $\mathbf{x}$, the momentum $\mathbf{p}$, the angular momentum $\mathbf{J}$, and the Runge-Lenz vector $\mathbf{A}$ all transform like vectors, and that the corresponding quantum quantities are all vector operators. (Compare Exercise 7.) In the “$\varepsilon$” notation of Sect. 18.4.1, Proposition 17.27 takes the form

\[ \frac{1}{i\hbar}[C_j, J_k] = \varepsilon_{jkl}C_l. \tag{18.26} \]

In the quantum mechanical calculations, there are a number of “quantum corrections,” in which dot products and cross products of vector operators do not behave as they do in the classical case.

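Lemma 18.18 The $\varepsilon$-function in Definition 18.6 satisfies the relations

\[ \varepsilon_{jkl}\varepsilon_{jmn} = \delta_{km}\delta_{ln} - \delta_{kn}\delta_{lm} \]

\[ \varepsilon_{jkl}\varepsilon_{jkm} = 2\delta_{lm}. \]
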

The proof of these results is not difficult and is left to the reader (Exercise 6). The following identities involving the cross product of vector operators will be useful to us.

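Lemma 18.19 If $\mathbf{C}$, $\mathbf{D}$, and $\mathbf{E}$ are arbitrary vector operators, we have

\[ \mathbf{C}\cdot(\mathbf{D}\times\mathbf{E}) = (\mathbf{C}\times\mathbf{D})\cdot\mathbf{E} \tag{18.27} \]

\[ (\mathbf{C}\times\mathbf{D} + \mathbf{D}\times\mathbf{C})_j = \varepsilon_{jkl}[C_k, D_l] \tag{18.28} \]

\[ (\mathbf{C}\times\mathbf{C})_j = \frac{1}{2}\varepsilon_{jkl}[C_k, C_l]. \tag{18.29} \]

In particular, if the different components of $\mathbf{C}$ commute, then $\mathbf{C}\times\mathbf{C} = 0$. Finally,

\[ (\mathbf{C}\times(\mathbf{D}\times\mathbf{E}))_j = C_kD_jE_k - C_kD_kE_j. \tag{18.30} \]

As special cases of these results, we have

\[ \mathbf{J}\times\mathbf{P} + \mathbf{P}\times\mathbf{J} = 2i\hbar\mathbf{P} \tag{18.31} \]

\[ \mathbf{J}\times\mathbf{J} = i\hbar\mathbf{J}. \tag{18.32} \]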

Note that if the entries of $\mathbf{D}$ and $\mathbf{E}$ commute, then the right-hand side of (18.30) reduces to the classical expression, $(\mathbf{C}\cdot\mathbf{E})\mathbf{D} - (\mathbf{C}\cdot\mathbf{D})\mathbf{E}$. Using (18.31), we can easily verify the alternative expression (18.19) for the Runge-Lenz vector.

Proof. The right-hand side of (18.27) is computed as $\varepsilon_{jkl}C_kD_lE_j$. If we note that $\varepsilon_{jkl} = \varepsilon_{klj}$ and then relabel the indices, we obtain $\varepsilon_{jkl}C_jD_kE_l$, which is equal to the left-hand side of (18.27). For (18.28), we compute that

\[
\begin{aligned}
(\mathbf{C}\times\mathbf{D} + \mathbf{D}\times\mathbf{C})_j &= \varepsilon_{jkl}C_kD_l + \varepsilon_{jkl}D_kC_l \\
&= \varepsilon_{jkl}C_kD_l + \varepsilon_{jkl}C_lD_k - \varepsilon_{jkl}[C_l, D_k].
\end{aligned}
\tag{18.33}
\]

If we note that $\varepsilon_{jkl} = -\varepsilon_{jlk}$ and then relabel the indices $k$ and $l$, we see that $\varepsilon_{jkl}C_lD_k = -\varepsilon_{jkl}C_kD_l$, so that the first two terms in the second line of (18.33) cancel. The remaining term can be put into the claimed form by relabeling the indices $k$ and $l$. The identity (18.29) is just the $\mathbf{D} = \mathbf{C}$ case of (18.28). Finally, (18.30) follows easily from Lemma 18.18.

To obtain (18.31) and (18.32), we apply (18.28) and (18.29), respectively. Since both $\mathbf{J}$ and $\mathbf{P}$ are vector operators, the desired result follows easily from Lemma 18.18. ■
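
Both ingredients of this proof can be checked by brute force. The sketch below verifies the contraction identity $\varepsilon_{jkl}\varepsilon_{jmn} = \delta_{km}\delta_{ln} - \delta_{kn}\delta_{lm}$ over all index values, and then checks $\mathbf{J}\times\mathbf{J} = i\hbar\mathbf{J}$ in one concrete realization: the $3\times 3$ matrices $J_j = i\hbar F_j$ with $(F_j)_{ab} = -\varepsilon_{jab}$ and $\hbar = 1$. That realization and the zero-based indices are assumptions made for the check, not notation from the text.

```python
from itertools import product

def eps(j, k, l):
    """Totally antisymmetric symbol on indices 0, 1, 2."""
    return (j - k) * (k - l) * (l - j) // 2

def delta(a, b):
    return 1 if a == b else 0

R = range(3)

# Contraction identity: eps_jkl eps_jmn = d_km d_ln - d_kn d_lm (sum over j).
contraction_ok = all(
    sum(eps(j, k, l) * eps(j, m, n) for j in R)
    == delta(k, m) * delta(l, n) - delta(k, n) * delta(l, m)
    for k, l, m, n in product(R, repeat=4)
)

def matmul(a, b):
    return [[sum(a[r][t] * b[t][c] for t in R) for c in R] for r in R]

hbar = 1  # units with hbar = 1 (an assumption made for this check)
# J_j = i*hbar*F_j with (F_j)_{ab} = -eps_{jab}, a standard basis of so(3).
J = [[[1j * hbar * (-eps(j, a, b)) for b in R] for a in R] for j in R]

def cross_component(j):
    """(J x J)_j = eps_jkl J_k J_l, as a 3x3 matrix."""
    total = [[0j] * 3 for _ in R]
    for k, l in product(R, repeat=2):
        if eps(j, k, l):
            prod = matmul(J[k], J[l])
            for a, b in product(R, repeat=2):
                total[a][b] += eps(j, k, l) * prod[a][b]
    return total

jxj_ok = all(
    cross_component(j) == [[1j * hbar * J[j][a][b] for b in R] for a in R]
    for j in R
)
print(contraction_ok, jxj_ok)
```

In the classical setting the cross product of a vector with itself vanishes, so a nonzero result for $\mathbf{J}\times\mathbf{J}$ is a purely quantum effect of noncommuting components.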

We now turn to the proofs of the results of Sect. 18.4. We prove only the 
quantum versions of the results, since the classical results are extremely 
similar, except that certain quantum corrections can be ignored. 

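Proof of Lemma 18.12, First Part. We begin by showing that $A_j$ commutes with $H$ for each $j$. Since $H$ commutes with $\mathbf{J}$, we have

\[ [A_j, H] = \frac{1}{2\mu Q^2}\,\varepsilon_{jkl}\left([P_k, H]J_l - J_k[P_l, H]\right) - \left[\frac{X_j}{|\mathbf{X}|}, H\right]. \]

Meanwhile, since the $P$'s commute among themselves, we have

\[ [P_k, H] = \left[P_k, -\frac{Q^2}{|\mathbf{X}|}\right] = -i\hbar Q^2\frac{X_k}{|\mathbf{X}|^3}. \]

Thus,

\[
\begin{aligned}
\varepsilon_{jkl}[P_k, H]J_l &= -i\hbar Q^2\,\varepsilon_{jkl}\frac{X_k}{|\mathbf{X}|^3}\,\varepsilon_{lmn}X_mP_n \\
&= -i\hbar Q^2(\delta_{jm}\delta_{kn} - \delta_{jn}\delta_{km})\frac{X_k}{|\mathbf{X}|^3}X_mP_n \\
&= -i\hbar Q^2\frac{1}{|\mathbf{X}|^3}(X_nX_jP_n - X_mX_mP_j) \\
&= -\frac{i\hbar Q^2}{|\mathbf{X}|^3}\left(X_j(\mathbf{X}\cdot\mathbf{P}) - (\mathbf{X}\cdot\mathbf{X})P_j\right).
\end{aligned}
\tag{18.34}
\]

We compute $\varepsilon_{jkl}J_k[P_l, H]$ in a similar way. Note that $J_k = \varepsilon_{kmn}X_mP_n = \varepsilon_{kmn}P_nX_m$, since $X_m$ and $P_n$ commute except when $m = n$, in which case $\varepsilon_{kmn} = 0$. The result is

\[ \varepsilon_{jkl}J_k[P_l, H] = -i\hbar Q^2\left(P_j(\mathbf{X}\cdot\mathbf{X}) - (\mathbf{P}\cdot\mathbf{X})X_j\right)\frac{1}{|\mathbf{X}|^3}. \]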

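Meanwhile, since the $X$'s commute among themselves, we have

\[
\begin{aligned}
\left[\frac{X_j}{|\mathbf{X}|}, H\right] &= \left[\frac{X_j}{|\mathbf{X}|}, \frac{P^2}{2\mu}\right] \\
&= \frac{1}{2\mu}\left[\frac{X_j}{|\mathbf{X}|}, P_k\right]P_k + \frac{1}{2\mu}P_k\left[\frac{X_j}{|\mathbf{X}|}, P_k\right] \\
&= \frac{i\hbar}{2\mu}\left(\frac{\delta_{jk}}{|\mathbf{X}|} - \frac{X_jX_k}{|\mathbf{X}|^3}\right)P_k + \frac{i\hbar}{2\mu}P_k\left(\frac{\delta_{jk}}{|\mathbf{X}|} - \frac{X_jX_k}{|\mathbf{X}|^3}\right) \\
&= \frac{i\hbar}{2\mu}\left(\frac{1}{|\mathbf{X}|}P_j - \frac{X_j}{|\mathbf{X}|^3}(\mathbf{X}\cdot\mathbf{P}) + P_j\frac{1}{|\mathbf{X}|} - (\mathbf{P}\cdot\mathbf{X})\frac{X_j}{|\mathbf{X}|^3}\right).
\end{aligned}
\tag{18.35}
\]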


It is now a simple matter to compute $[A_j, H]$ by combining (18.34) and (18.35) and verify that everything cancels. We have, for example, a term involving $(X_j/|\mathbf{X}|^3)(\mathbf{X}\cdot\mathbf{P})$ in (18.34) and a canceling term in (18.35). ■

Before proceeding with the remaining results concerning the Runge-Lenz vector, we verify some results that will be needed later. There are some quantum corrections compared to the corresponding classical results.


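Lemma 18.20 As in the classical case, the following “orthogonality” relations among vector operators hold:

\[ \mathbf{J}\cdot\mathbf{P} = \mathbf{P}\cdot\mathbf{J} = 0 \tag{18.36} \]

\[ \mathbf{J}\cdot\mathbf{X} = \mathbf{X}\cdot\mathbf{J} = 0 \tag{18.37} \]

\[ (\mathbf{P}\times\mathbf{J})\cdot\mathbf{J} = \mathbf{J}\cdot(\mathbf{P}\times\mathbf{J}) = 0. \tag{18.38} \]

Meanwhile, there is a quantum correction in the dot product between $\mathbf{P}$ and $\mathbf{P}\times\mathbf{J}$, as follows:

\[ \mathbf{P}\cdot(\mathbf{P}\times\mathbf{J}) = 0 \tag{18.39} \]

\[ (\mathbf{P}\times\mathbf{J})\cdot\mathbf{P} = 2i\hbar(\mathbf{P}\cdot\mathbf{P}). \tag{18.40} \]

Finally, we have

\[ (\mathbf{P}\times\mathbf{J})\cdot(\mathbf{P}\times\mathbf{J}) = (\mathbf{P}\cdot\mathbf{P})(\mathbf{J}\cdot\mathbf{J}) \tag{18.41} \]

\[ \mathbf{X}\cdot(\mathbf{P}\times\mathbf{J}) = \mathbf{J}\cdot\mathbf{J} \tag{18.42} \]

\[ (\mathbf{P}\times\mathbf{J})\cdot\mathbf{X} = \mathbf{J}\cdot\mathbf{J} + 2i\hbar\,\mathbf{P}\cdot\mathbf{X}. \tag{18.43} \]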

Proof. By (18.27) and (18.29), we have

\[ \mathbf{J}\cdot\mathbf{P} = (\mathbf{X}\times\mathbf{P})\cdot\mathbf{P} = \mathbf{X}\cdot(\mathbf{P}\times\mathbf{P}) = 0, \]

since the different components of $\mathbf{P}$ commute. The same reasoning shows that $\mathbf{P}\cdot\mathbf{J}$, $\mathbf{J}\cdot\mathbf{X}$, and $\mathbf{X}\cdot\mathbf{J}$ are all zero. To compute $(\mathbf{P}\times\mathbf{J})\cdot\mathbf{J}$, we first use (18.27), then use (18.32), and then use that $\mathbf{P}\cdot\mathbf{J} = 0$. For $\mathbf{J}\cdot(\mathbf{P}\times\mathbf{J})$, we rewrite $\mathbf{P}\times\mathbf{J}$ in terms of $\mathbf{J}\times\mathbf{P}$, using (18.31). The correction term involves $\mathbf{P}$, which has a dot product of zero with $\mathbf{J}$, and so the answer is again zero.

We use (18.27) and (18.29) again to establish (18.39). To get (18.40), we first rewrite $\mathbf{P}\times\mathbf{J}$ in terms of $\mathbf{J}\times\mathbf{P}$ using (18.31) and then apply (18.39). To establish (18.41), we apply (18.27) and then (18.30), giving

\[ (\mathbf{P}\times\mathbf{J})\cdot(\mathbf{P}\times\mathbf{J}) = P_jJ_kP_jJ_k - P_jJ_kP_kJ_j. \tag{18.44} \]

The second term on the right-hand side of (18.44) is zero because $\mathbf{J}\cdot\mathbf{P} = 0$. For the first term, we move $J_k$ to the right past $P_j$. This generates the term we want plus a correction term equal to $i\hbar\varepsilon_{kjl}P_jP_lJ_k$. The correction term is zero because $P_j$ and $P_l$ commute and $\varepsilon_{kjl}$ changes sign under interchange of $j$ and $l$. The identity (18.42) follows immediately from (18.27) and the definition of $\mathbf{J}$. The identity (18.43) follows from (18.27) and (18.28). ■


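Lemma 18.21 For all $j$ and $m$, we have

\[ [(\mathbf{P}\times\mathbf{J})_j, (\mathbf{P}\times\mathbf{J})_m] = -i\hbar(\mathbf{P}\cdot\mathbf{P})\,\varepsilon_{jml}J_l. \]

Proof. In computing $[P_kJ_l, P_nJ_o]$, we use repeatedly the product rule for commutators (Point 3 of Proposition 3.15). We obtain four terms, one of which is zero (the term involving $[P_k, P_n]$). We use Proposition 17.27 (in the form (18.26)) to evaluate all remaining terms, giving

\[
\frac{1}{i\hbar}[\varepsilon_{jkl}P_kJ_l, \varepsilon_{mno}P_nJ_o]
= \frac{1}{i\hbar}\,\varepsilon_{jkl}\varepsilon_{mno}\left(P_k[J_l, P_n]J_o + P_nP_k[J_l, J_o] + P_n[P_k, J_o]J_l\right).
\tag{18.45}
\]

Let us compute the first of the three terms on the right-hand side of (18.45). Using Lemma 18.18 and the fact that $\mathbf{P}$ is a vector operator, we get

\[
\begin{aligned}
\frac{1}{i\hbar}\,\varepsilon_{jkl}\varepsilon_{mno}P_k[J_l, P_n]J_o &= \varepsilon_{jkl}(\delta_{op}\delta_{ml} - \delta_{ol}\delta_{mp})P_kP_pJ_o \\
&= \varepsilon_{jkm}P_kP_pJ_p - \varepsilon_{jko}P_kP_mJ_o \\
&= \varepsilon_{jkm}P_k(\mathbf{P}\cdot\mathbf{J}) - P_m(\mathbf{P}\times\mathbf{J})_j.
\end{aligned}
\]

If we compute the second and third terms similarly, we obtain

\[
\begin{aligned}
\frac{1}{i\hbar}[\varepsilon_{jkl}P_kJ_l, \varepsilon_{mno}P_nJ_o] ={}& \varepsilon_{jkm}P_k(\mathbf{P}\cdot\mathbf{J}) - P_m(\mathbf{P}\times\mathbf{J})_j \\
&+ (\mathbf{P}\times\mathbf{P})_jJ_m - \varepsilon_{jkm}P_k(\mathbf{P}\cdot\mathbf{J}) + P_m(\mathbf{P}\times\mathbf{J})_j - (\mathbf{P}\cdot\mathbf{P})\,\varepsilon_{jml}J_l.
\end{aligned}
\]

Three of the above terms are zero (those involving $\mathbf{P}\cdot\mathbf{J}$ or $\mathbf{P}\times\mathbf{P}$) and two other terms cancel, leaving us with

\[ \frac{1}{i\hbar}[\varepsilon_{jkl}P_kJ_l, \varepsilon_{mno}P_nJ_o] = -(\mathbf{P}\cdot\mathbf{P})\,\varepsilon_{jml}J_l, \]

as claimed. ■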

We now continue with the proof of the properties of the Runge-Lenz 
vector. 

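Proof of Proposition 18.11. From the first set of orthogonality relations in Lemma 18.20, we can see easily that $\mathbf{J}\cdot\mathbf{A} = \mathbf{A}\cdot\mathbf{J} = 0$. Meanwhile, using the expression (18.19) for $\mathbf{A}$ and expanding out $\mathbf{A}\cdot\mathbf{A}$ yields, after a little simplification,

\[
\mathbf{A}\cdot\mathbf{A} = 1 + \frac{1}{\mu^2Q^4}(\mathbf{P}\cdot\mathbf{P})(\mathbf{J}\cdot\mathbf{J} + \hbar^2)
- \frac{1}{\mu Q^2}\left(2\,\mathbf{J}\cdot\mathbf{J}\frac{1}{|\mathbf{X}|} + i\hbar\left(\mathbf{P}\cdot\frac{\mathbf{X}}{|\mathbf{X}|} - \frac{\mathbf{X}}{|\mathbf{X}|}\cdot\mathbf{P}\right)\right).
\]

Now,

\[
\frac{\mathbf{X}}{|\mathbf{X}|}\cdot\mathbf{P} - \mathbf{P}\cdot\frac{\mathbf{X}}{|\mathbf{X}|}
= i\hbar\left(\frac{\delta_{kk}}{|\mathbf{X}|} - \frac{X_kX_k}{|\mathbf{X}|^3}\right)
= 2i\hbar\frac{1}{|\mathbf{X}|}.
\]

Thus,

\[
\mathbf{A}\cdot\mathbf{A} = 1 + \frac{2}{\mu Q^4}\left(\mathbf{J}\cdot\mathbf{J} + \hbar^2\right)\left(\frac{\mathbf{P}\cdot\mathbf{P}}{2\mu} - \frac{Q^2}{|\mathbf{X}|}\right),
\]

as claimed. ■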

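Proof of Lemma 18.12, Second Part. We write $\mathbf{A}$ in the form given in (18.19). In computing the commutator of $A_j$ with $A_m$, we get several different types of terms, which we compute one at a time. Of course, the commutator of $X_j/|\mathbf{X}|$ with $X_m/|\mathbf{X}|$ is zero. The commutator of the $\mathbf{P}\times\mathbf{J}$ terms has been computed in Lemma 18.21.

Meanwhile, to compute the commutator of $P_kJ_l$ with $X_m(1/|\mathbf{X}|)$, we again get four terms and, again, one of these is zero, namely the one involving $[J_l, 1/|\mathbf{X}|]$, since $1/|\mathbf{X}|$ is invariant under rotations. We have, then,

\[
\begin{aligned}
\frac{1}{i\hbar}\left[\varepsilon_{jkl}P_kJ_l, X_m\frac{1}{|\mathbf{X}|}\right]
&= \frac{1}{i\hbar}\left(\varepsilon_{jkl}[P_k, X_m]J_l\frac{1}{|\mathbf{X}|} + \varepsilon_{jkl}P_k[J_l, X_m]\frac{1}{|\mathbf{X}|} + \varepsilon_{jkl}X_m\left[P_k, \frac{1}{|\mathbf{X}|}\right]J_l\right) \\
&= -\varepsilon_{jkl}\delta_{km}J_l\frac{1}{|\mathbf{X}|} + \varepsilon_{jkl}\varepsilon_{lmn}P_kX_n\frac{1}{|\mathbf{X}|} + \varepsilon_{jkl}X_m\frac{X_k}{|\mathbf{X}|^3}J_l.
\end{aligned}
\]

If we apply Lemma 18.18 and carry out some computations similar to ones we have already performed, we obtain

\[
\frac{1}{i\hbar}\left[\varepsilon_{jkl}P_kJ_l, X_m\frac{1}{|\mathbf{X}|}\right]
= -\varepsilon_{jml}J_l\frac{1}{|\mathbf{X}|} + \delta_{jm}(\mathbf{P}\cdot\mathbf{X})\frac{1}{|\mathbf{X}|} + X_mX_j\frac{1}{|\mathbf{X}|^3}(\mathbf{X}\cdot\mathbf{P})
- \left(P_m\frac{X_j}{|\mathbf{X}|} + \frac{X_m}{|\mathbf{X}|}P_j\right).
\tag{18.46}
\]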


In a commutator of the form $[\alpha_j + \beta_j, \alpha_m + \beta_m]$, the terms involving the commutator of an $\alpha$ with a $\beta$ will be $[\alpha_j, \beta_m] + [\beta_j, \alpha_m]$, which is equal to $[\alpha_j, \beta_m] - [\alpha_m, \beta_j]$. This quantity is skew-symmetric in $j$ and $m$, meaning that it changes sign when we interchange $j$ with $m$. Thus, terms in (18.46) that are symmetric in $j$ and $m$ will disappear when we compute the full commutator of $A_j$ with $A_m$. Thus, the second and third terms in (18.46) can be ignored. In the last term, we can commute $P_m$ past $X_j$ to obtain

\[
P_m\frac{X_j}{|\mathbf{X}|} + \frac{X_m}{|\mathbf{X}|}P_j
= \frac{X_j}{|\mathbf{X}|}P_m + \frac{X_m}{|\mathbf{X}|}P_j - i\hbar\left(\frac{\delta_{jm}}{|\mathbf{X}|} - \frac{X_jX_m}{|\mathbf{X}|^3}\right),
\tag{18.47}
\]

which is also symmetric. Thus, only the first term in (18.46) contributes to the computation of $[A_j, A_m]$. This term is skew-symmetric in $j$ and $m$ and will be doubled when we compute $[A_j, A_m]$.

Now, it is straightforward to compute $[\varepsilon_{jkl}P_kJ_l, P_m]$ and $[P_j, X_m/|\mathbf{X}|]$ and to verify that these commutators are symmetric in $j$ and $m$ (Exercise 8) and therefore do not contribute to the computation of $[A_j, A_m]$. We are left, then, with the following:

\[
\begin{aligned}
\frac{1}{i\hbar}[A_j, A_m] &= -\frac{1}{\mu^2Q^4}\,\varepsilon_{jml}(\mathbf{P}\cdot\mathbf{P})J_l + \frac{1}{\mu Q^2}\,2\varepsilon_{jml}J_l\frac{1}{|\mathbf{X}|} \\
&= -\frac{2}{\mu Q^4}\,\varepsilon_{jml}J_l\left(\frac{\mathbf{P}\cdot\mathbf{P}}{2\mu} - \frac{Q^2}{|\mathbf{X}|}\right),
\end{aligned}
\]

which is what is claimed in the lemma. ■

Proof of Theorem 18.14. Since the Hamiltonian $H$ is invariant under rotations, $H$ commutes with each component of the angular momentum. We have also established that $H$ commutes with each component of the Runge-Lenz vector. From this it follows easily that $\mathbf{I}$ and $\mathbf{K}$ commute with the Hamiltonian.

Since $A_k$ commutes with $H$, it also commutes with any function of $H$. It then follows from Lemma 18.12 that

\[
\frac{1}{i\hbar}[B_k, B_l] = \frac{\mu Q^4}{2|H|}\,\frac{1}{i\hbar}[A_k, A_l] = -\frac{\mu Q^4}{2|H|}\,\frac{2}{\mu Q^4}\,\varepsilon_{klj}J_jH.
\]

Since $H/|H| = -I$ on the negative-energy subspace $V^-$, the above expression reduces to $\varepsilon_{klj}J_j$. (The result on the positive-energy subspace will differ by a crucial minus sign from what we have on $V^-$.)

Meanwhile, since both $\mathbf{B}$ and $\mathbf{J}$ are vector operators, we have, by Proposition 17.27, $(1/(i\hbar))[B_j, J_k] = \varepsilon_{jkl}B_l$ and $(1/(i\hbar))[J_j, J_k] = \varepsilon_{jkl}J_l$. From the commutation relations among the $B_j$'s and $J_j$'s, it is an easy calculation to verify the claimed commutation relations among the components of $\mathbf{I}$ and $\mathbf{K}$. ■
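
The “easy calculation” at the end of this proof can be delegated to a short script. The sketch below treats $J_1, J_2, J_3, B_1, B_2, B_3$ as abstract basis elements of a 6-dimensional space, encodes the three relations $(1/(i\hbar))[J_j, J_k] = \varepsilon_{jkl}J_l$, $(1/(i\hbar))[B_j, J_k] = \varepsilon_{jkl}B_l$, and $(1/(i\hbar))[B_j, B_k] = \varepsilon_{jkl}J_l$ as a bilinear bracket, and checks the relations for $\mathbf{I} = (\mathbf{J}+\mathbf{B})/2$ and $\mathbf{K} = (\mathbf{J}-\mathbf{B})/2$. The coefficient-vector encoding and zero-based indices are illustrative choices.

```python
from fractions import Fraction
from itertools import product

def eps(j, k, l):
    """Totally antisymmetric symbol on indices 0, 1, 2."""
    return (j - k) * (k - l) * (l - j) // 2

def bracket(u, v):
    """(1/(i hbar)) [u, v] for u, v given as coefficient vectors
    (J_0, J_1, J_2, B_0, B_1, B_2), extended bilinearly from
    [J, J] -> J, [B, J] -> B, [J, B] -> B, [B, B] -> J."""
    out = [Fraction(0)] * 6
    for j, k, l in product(range(3), repeat=3):
        e = eps(j, k, l)
        if e:
            out[l] += e * (u[j] * v[k] + u[j + 3] * v[k + 3])
            out[l + 3] += e * (u[j + 3] * v[k] + u[j] * v[k + 3])
    return out

def half_combo(sign, j):
    """(J_j + sign * B_j) / 2 as a coefficient vector."""
    vec = [Fraction(0)] * 6
    vec[j] = Fraction(1, 2)
    vec[j + 3] = Fraction(sign, 2)
    return vec

I = [half_combo(+1, j) for j in range(3)]
K = [half_combo(-1, j) for j in range(3)]

def eps_sum(vectors, j, k):
    """sum over l of eps_jkl * vectors[l]."""
    return [sum(eps(j, k, l) * vectors[l][c] for l in range(3)) for c in range(6)]

pairs = list(product(range(3), repeat=2))
ii_ok = all(bracket(I[j], I[k]) == eps_sum(I, j, k) for j, k in pairs)
kk_ok = all(bracket(K[j], K[k]) == eps_sum(K, j, k) for j, k in pairs)
ik_ok = all(bracket(I[j], K[k]) == [0] * 6 for j, k in pairs)
print(ii_ok, kk_ok, ik_ok)
```

The relation $[J_j, B_k] \to \varepsilon_{jkl}B_l$ used inside `bracket` follows from the $[B_j, J_k]$ relation by antisymmetry of both the commutator and $\varepsilon$.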


18.7 Exercises 


1. Consider the quantum Hamiltonian for two particles in $\mathbb{R}^3$ interacting by means of a $1/r$ potential:

\[ H = -\frac{\hbar^2}{2m_1}\Delta_1 - \frac{\hbar^2}{2m_2}\Delta_2 - \frac{Q^2}{|\mathbf{x}^1 - \mathbf{x}^2|}. \]

Here, as in Sect. 3.11, $\Delta_1$ is the Laplacian with respect to the variable $\mathbf{x}^1$ and $\Delta_2$ is the Laplacian with respect to the variable $\mathbf{x}^2$. As in Sect. 2.3.3, introduce new variables consisting of the center of mass, $\mathbf{c} = (m_1\mathbf{x}^1 + m_2\mathbf{x}^2)/(m_1 + m_2)$, and the relative position, $\mathbf{y} = \mathbf{x}^1 - \mathbf{x}^2$.

Show that $H$ can be expressed in these variables as

\[ H = -\frac{\hbar^2}{2(m_1 + m_2)}\Delta_{\mathbf{c}} - \frac{\hbar^2}{2\mu}\Delta_{\mathbf{y}} - \frac{Q^2}{|\mathbf{y}|}, \]

where $\mu$ is the reduced mass, given by $\mu = m_1m_2/(m_1 + m_2)$.

Note: In the new variables, $H$ is the sum of two terms, one of which involves only the variable $\mathbf{c}$ and one of which involves only the variable $\mathbf{y}$. The term involving only $\mathbf{c}$ is the Hamiltonian for a free particle with mass $m_1 + m_2$, whereas the term involving only $\mathbf{y}$ is the Hamiltonian for a particle of mass $\mu$ moving in a $1/r$ potential.

2. Let $H(\mathbf{x}, \mathbf{p}) = |\mathbf{p}|^2/(2\mu) - Q^2/|\mathbf{x}|$ denote the Hamiltonian for the classical Kepler problem in $\mathbb{R}^3$. Show that for every $\varepsilon > 0$, the region in $\mathbb{R}^6$ given by $\{(\mathbf{x}, \mathbf{p}) \mid H(\mathbf{x}, \mathbf{p}) < -\varepsilon\}$ has finite volume.

3. Let $\mathbb{H}$ denote the real span of the following four elements of $M_2(\mathbb{C})$:



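Show that $\mathbb{H}$ forms an associative algebra over $\mathbb{R}$, under the operation of matrix multiplication, and that the following relations are satisfied:

\[ \mathbf{i}^2 = \mathbf{j}^2 = \mathbf{k}^2 = -\mathbf{1} \]
\[ \mathbf{i}\mathbf{j} = -\mathbf{j}\mathbf{i} = \mathbf{k} \]
\[ \mathbf{j}\mathbf{k} = -\mathbf{k}\mathbf{j} = \mathbf{i} \]
\[ \mathbf{k}\mathbf{i} = -\mathbf{i}\mathbf{k} = \mathbf{j}. \]

The algebra $\mathbb{H}$ is (one particular realization of) the quaternion algebra.

Show that each nonzero element of $\mathbb{H}$ has a multiplicative inverse.

Hint: Imitate the argument that each nonzero complex number has a multiplicative inverse.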
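
For readers who want to experiment before attempting the exercise, the following sketch checks the quaternion relations for one common choice of $2\times 2$ complex matrices. This choice is an assumption made here for illustration; it is a standard realization but is not necessarily the set of matrices intended in the exercise.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[r][t] * b[t][c] for t in range(2)) for c in range(2)]
            for r in range(2)]

def scale(s, a):
    return [[s * entry for entry in row] for row in a]

def add(*ms):
    return [[sum(m[r][c] for m in ms) for c in range(2)] for r in range(2)]

# One standard realization (assumed here, not quoted from the exercise).
ONE = [[1, 0], [0, 1]]
I = [[1j, 0], [0, -1j]]
J = [[0, 1], [-1, 0]]
K = [[0, 1j], [1j, 0]]
MINUS_ONE = scale(-1, ONE)

squares_ok = all(matmul(q, q) == MINUS_ONE for q in (I, J, K))
products_ok = (
    matmul(I, J) == K and matmul(J, I) == scale(-1, K)
    and matmul(J, K) == I and matmul(K, J) == scale(-1, I)
    and matmul(K, I) == J and matmul(I, K) == scale(-1, J)
)

# For q = a*1 + b*i + c*j + d*k, compare det(q) with a^2 + b^2 + c^2 + d^2.
a, b, c, d = 1, 2, 3, 4
q = add(scale(a, ONE), scale(b, I), scale(c, J), scale(d, K))
det_q = q[0][0] * q[1][1] - q[0][1] * q[1][0]
print(squares_ok, products_ok, det_q, a * a + b * b + c * c + d * d)
```

In this realization the determinant plays the role that $|z|^2$ plays for complex numbers, which is the observation the hint points toward.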

4. Let $\mathbb{H}$ denote the quaternion algebra defined in Exercise 3. This exercise establishes explicitly an isomorphism between the Lie algebras so(4) and so(3) $\oplus$ so(3) (compare Definition 16.14).

(a) Let $V$ be the subspace of $\mathbb{H}$ spanned by $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$. Show that $V$ forms a Lie algebra under the bracket $[\alpha, \beta] = \alpha\beta - \beta\alpha$ and that $V$ is isomorphic as a Lie algebra to so(3).

(b) Let End($\mathbb{H}$) denote the algebra of real-linear maps of $\mathbb{H}$ to itself. Given $\alpha \in V$, let $L_\alpha \in$ End($\mathbb{H}$) be the “left multiplication by $\alpha$” map, $L_\alpha(\beta) = \alpha\beta$, and let $R_\alpha \in$ End($\mathbb{H}$) be the “right multiplication by $\alpha$” map, $R_\alpha(\beta) = \beta\alpha$. Show that the maps $\alpha \mapsto L_\alpha$ and $\alpha \mapsto -R_\alpha$ are Lie algebra homomorphisms of $V$ into End($\mathbb{H}$).

(c) Consider the inner product on $\mathbb{H}$ in which $\{\mathbf{1}, \mathbf{i}, \mathbf{j}, \mathbf{k}\}$ forms an orthonormal basis. Given $\alpha \in V$, show that

\[ \langle L_\alpha\beta, \gamma\rangle = -\langle\beta, L_\alpha\gamma\rangle \]
\[ \langle R_\alpha\beta, \gamma\rangle = -\langle\beta, R_\alpha\gamma\rangle. \]

That is to say, $L_\alpha$ and $R_\alpha$ belong to so(4), which we identify with the space of elements of End($\mathbb{H}$) that are skew-symmetric with respect to the inner product in Part (c).

(d) Show that the map $(\alpha, \beta) \mapsto L_\alpha - R_\beta$ is a Lie algebra isomorphism of so(3) $\oplus$ so(3) to so(4).

(e) Let $D$ denote the diagonal subalgebra of so(3) $\oplus$ so(3), that is, the set of elements of the form $(X, X)$. Show that the image of $D$ under the isomorphism in Part (d) is the set of elements $Y$ of so(4) $\subset$ End($\mathbb{H}$) having the following form with respect to the basis in Part (c):

\[ Y = \begin{pmatrix} 0 & 0 \\ 0 & Z \end{pmatrix}, \]

where $Z \in$ so(3).


5. Describe explicitly the two subalgebras of so(4) corresponding to the two copies of so(3) in the isomorphism

\[ \mathrm{so}(4) \cong \mathrm{so}(3) \oplus \mathrm{so}(3) \]

in Exercise 4.


6. Verify Lemma 18.18.

Hint: First show that $\varepsilon_{jkl}\varepsilon_{jmn} = 0$ unless $(k, l) = (m, n)$ or $(k, l) = (n, m)$.

7. In this exercise, we use the summation convention of Sect. 18.4.1.

(a) Show that for any $3 \times 3$ matrix $M$ and any indices $j, k, l \in \{1, 2, 3\}$, we have

\[ \varepsilon_{mno}M_{jm}M_{kn}M_{lo} = \varepsilon_{jkl}(\det M). \]

(b) Show that if $\mathbf{C}$ is a vector operator, then for all $R \in$ SO(3), we have

\[ \Pi(R)C_k\Pi(R)^{-1} = R_{lk}C_l. \]

(c) Show that the cross product of two vector operators is a vector operator.

Hint: Write the definition of a vector operator in the equivalent form

\[ \mathbf{v}\cdot\mathbf{C} = \Pi(R)\left((R^{-1}\mathbf{v})\cdot\mathbf{C}\right)\Pi(R)^{-1}. \]
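
The identity in Part (a) is easy to test numerically before proving it. The sketch below checks it for one arbitrarily chosen integer matrix, using zero-based indices; the matrix `M` is purely illustrative.

```python
from itertools import product

def eps(j, k, l):
    """Totally antisymmetric symbol on indices 0, 1, 2."""
    return (j - k) * (k - l) * (l - j) // 2

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def contracted(M, j, k, l):
    """eps_mno M_jm M_kn M_lo, summed over m, n, o."""
    return sum(eps(m, n, o) * M[j][m] * M[k][n] * M[l][o]
               for m, n, o in product(range(3), repeat=3))

M = [[2, -1, 0], [1, 3, 4], [0, 5, -2]]   # arbitrary test matrix
identity_ok = all(contracted(M, j, k, l) == eps(j, k, l) * det3(M)
                  for j, k, l in product(range(3), repeat=3))
print(identity_ok, det3(M))
```

Taking $(j, k, l) = (1, 2, 3)$ recovers the usual Leibniz formula for the determinant; the other index choices express its antisymmetry under row permutations.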

8. Compute $[\varepsilon_{jkl}P_kJ_l, P_m]$ and $[P_j, X_m/|\mathbf{X}|]$ and show that both of these quantities are symmetric in $j$ and $m$, meaning that the value is unchanged if we interchange $j$ and $m$.

9. Show that the Eq. (18.14) has two power series solutions for g(p), one 

starting with and one starting with p°. 

Systems and Subsystems,
Multiple Particles 


19.1 Introduction 

Up to this point, we have considered the state of a quantum system to be described by a unit vector in the corresponding Hilbert space, or more properly, an equivalence class of unit vectors under the equivalence relation $\psi \sim e^{i\theta}\psi$. We will see in this section that this notion of the state of a quantum system is too limited. We will introduce a more general notion of the state of a system, described by a density matrix. The special case in which the system can be described by a unit vector will be called a pure state.

One way to see the inadequacy of the notion of state as a unit vector is to consider systems and subsystems. We will examine this topic in greater detail in Sect. 19.5, but for now let us consider the example of a system of two spinless “distinguishable” particles moving in $\mathbb{R}^3$. (For now, the reader need not worry about the notion of distinguishable particles; just think of them as being two different types of particles, with, say, different masses or charges.) Let us assume the combined state of the two particles can be described by a unit vector in the corresponding Hilbert space, which is (according to Sect. 3.11) $L^2(\mathbb{R}^6)$. We have, then, a wave function $\psi(\mathbf{x}, \mathbf{y})$, where $\mathbf{x}$ is the position of the first particle and $\mathbf{y}$ is the position of the second particle.

Given a wave function $\psi(\mathbf{x}, \mathbf{y})$ for the combined system, what is the wave function describing the state of the first particle only? If the wave function of the combined system happens to be a product, say, $\psi(\mathbf{x}, \mathbf{y}) =$

B.C. Hall, Quantum Theory for Mathematicians , Graduate Texts 419 
$\psi_1(\mathbf{x})\psi_2(\mathbf{y})$, then, naturally, we would say that the state of the first particle is simply $\psi_1$. Of course, one might object that we could rewrite $\psi$ as $\psi(\mathbf{x}, \mathbf{y}) = [c\psi_1(\mathbf{x})][\psi_2(\mathbf{y})/c]$ for any nonzero constant $c$, but this only affects the wave function for the first particle by a constant, which does not affect the physical state.

In general, however, the wave function of the combined system need not be a product. Already when $\psi$ is a linear combination of two products, $\psi(\mathbf{x}, \mathbf{y}) = \psi_1(\mathbf{x})\psi_2(\mathbf{y}) + \phi_1(\mathbf{x})\phi_2(\mathbf{y})$, it is unclear what the correct wave function is for the first particle. At first glance, it might seem natural to try $\psi_1(\mathbf{x}) + \phi_1(\mathbf{x})$, but upon closer examination, this is not an unambiguous proposal. After all, we can just as well write $\psi(\mathbf{x}, \mathbf{y}) = [c_1\psi_1(\mathbf{x})][\psi_2(\mathbf{y})/c_1] + [c_2\phi_1(\mathbf{x})][\phi_2(\mathbf{y})/c_2]$, but then the resulting wave functions for the first particle, $\psi_1(\mathbf{x}) + \phi_1(\mathbf{x})$ and $c_1\psi_1(\mathbf{x}) + c_2\phi_1(\mathbf{x})$, are not scalar multiples of one another. For a general unit vector $\psi$ in $L^2(\mathbb{R}^6)$, the situation is even worse. The conclusion is this: There does not seem to be any way to associate to a general unit vector $\psi$ in $L^2(\mathbb{R}^6)$ a unit vector $\psi'$ in $L^2(\mathbb{R}^3)$ such that $\psi'$ could sensibly be described as “the state of the first particle.”
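
A finite-dimensional toy version makes the ambiguity concrete. In the sketch below, $L^2(\mathbb{R}^3)$ is replaced by $\mathbb{C}^2$ and wave functions by coefficient lists; the particular vectors and constants are arbitrary illustrative choices, and normalization is ignored since it plays no role in the argument.

```python
from fractions import Fraction as F

def combined(psi1, psi2, phi1, phi2):
    """Coefficients of psi1 (x) psi2 + phi1 (x) phi2 in C^2 (x) C^2."""
    return [[psi1[a] * psi2[b] + phi1[a] * phi2[b] for b in range(2)]
            for a in range(2)]

psi1, psi2 = [F(1), F(0)], [F(1), F(0)]
phi1, phi2 = [F(0), F(1)], [F(0), F(1)]
c1, c2 = F(2), F(3)

# Rescaling the factors by c1 and c2 leaves the combined vector unchanged.
same_state = combined(psi1, psi2, phi1, phi2) == combined(
    [c1 * t for t in psi1], [t / c1 for t in psi2],
    [c2 * t for t in phi1], [t / c2 for t in phi2])

candidate_a = [s + t for s, t in zip(psi1, phi1)]             # psi1 + phi1
candidate_b = [c1 * s + c2 * t for s, t in zip(psi1, phi1)]   # c1 psi1 + c2 phi1
# Two vectors in C^2 are proportional exactly when this determinant vanishes.
det = candidate_a[0] * candidate_b[1] - candidate_a[1] * candidate_b[0]
print(same_state, det)
```

The two decompositions describe the same combined vector, yet the two candidate “first-particle” vectors are linearly independent.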

Although we cannot associate with $\psi$ a wave function $\psi'$ for the first particle, there is no difficulty in taking expectation values of observables related to the first particle. We can make perfect sense of, say, the expected position of the first particle, as

\[ \left\langle\psi, X_j^{(1)}\psi\right\rangle = \int_{\mathbb{R}^6} x_j\,|\psi(\mathbf{x}, \mathbf{y})|^2\,d\mathbf{x}\,d\mathbf{y}. \]

Here $X_j^{(1)}$ indicates the operator of multiplication by the $j$th component of the first vector in the function $\psi(\cdot, \cdot) : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{C}$. That is to say, the operator $X_j$ acting on $L^2(\mathbb{R}^3)$ can be “promoted” to an operator $X_j^{(1)}$ on $L^2(\mathbb{R}^6)$ by having it act in the first variable only. Similarly, the momentum operator $P_j$ on $L^2(\mathbb{R}^3)$ can be promoted to an operator $P_j^{(1)}$ on $L^2(\mathbb{R}^6)$, by letting it act on the first variable, meaning that $P_j^{(1)}\psi$ is $-i\hbar$ times the partial derivative with respect to the $j$th component of the first vector in $\psi(\cdot, \cdot)$. In fact, as we will see in Sect. 19.5, given any self-adjoint operator on $L^2(\mathbb{R}^3)$, there is a natural way to promote it into an operator on $L^2(\mathbb{R}^6)$, where its expectation value may then be defined.

Thus, although there is no natural way to associate with a unit vector $\psi$ in $L^2(\mathbb{R}^6)$ a unit vector in $L^2(\mathbb{R}^3)$, there is a natural way to associate with $\psi$ expectation values of observables on $L^2(\mathbb{R}^3)$. This suggests that we should introduce a more general notion of the “state” of a quantum system, a notion in which with each “reasonable” family of expectation values for the quantum observables there is associated a quantum state. This notion turns out to be that of density matrices (positive, self-adjoint operators with trace 1).

In Sect. 19.3, we introduce the notion of a density matrix. Theorem 19.9 in that section will tell us that, given any reasonable assignment $\Phi$ of expectation values to observables, there is a unique density matrix $\rho$ such that $\Phi(A) = \mathrm{trace}(\rho A)$ for all observables $A$. In the special case in which the state of the system is given by a unit vector $\psi$ in the Hilbert space, $\rho$ will be just the projection onto $\psi$ and $\mathrm{trace}(\rho A)$ will be equal to the familiar expression $\langle\psi, A\psi\rangle$. In Sect. 19.5, we will consider composite quantum systems and introduce a method (the partial trace) of defining a density matrix for a subsystem from a density matrix for the whole system. Finally, in Sect. 19.6, we will consider the important special case of composite systems made up of multiple identical particles.
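
The pure-state case can be illustrated in $\mathbb{C}^2$. The sketch below builds the projection onto a unit vector and compares $\mathrm{trace}(\rho A)$ with $\langle\psi, A\psi\rangle$; the vector `psi` and the matrix `A` are arbitrary illustrative choices, and the inner product is taken to be conjugate-linear in its first argument.

```python
def expectation(psi, A):
    """<psi, A psi>, with the inner product conjugate-linear in the first slot."""
    n = len(psi)
    A_psi = [sum(A[r][c] * psi[c] for c in range(n)) for r in range(n)]
    return sum(psi[r].conjugate() * A_psi[r] for r in range(n))

def projection(psi):
    """Matrix of the orthogonal projection onto the unit vector psi."""
    n = len(psi)
    return [[psi[r] * psi[c].conjugate() for c in range(n)] for r in range(n)]

def trace_of_product(rho, A):
    """trace(rho A) = sum over r, c of rho[r][c] * A[c][r]."""
    n = len(rho)
    return sum(rho[r][c] * A[c][r] for r in range(n) for c in range(n))

psi = [0.6 + 0j, 0.8j]                   # a unit vector in C^2
A = [[1, 2 - 1j], [2 + 1j, -1]]          # a self-adjoint 2x2 matrix
rho = projection(psi)

trace_rho = sum(rho[i][i] for i in range(2))
print(trace_rho, trace_of_product(rho, A), expectation(psi, A))
```

The two expectation values agree up to rounding, and the trace of `rho` is 1, as required of a density matrix.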


19.2 Trace-Class and Hilbert-Schmidt Operators 

In this section, we explore notions related to the trace of an operator on a 
Hilbert space. The results of this section are presented without proof; see 
Chap. VI in Volume I of [34] for proofs and additional information. 

Proposition 19.1 Suppose $A \in \mathcal{B}(\mathbf{H})$ is non-negative and self-adjoint. Then for any two orthonormal bases $\{e_j\}$ and $\{f_j\}$ for $\mathbf{H}$, we have

\[ \sum_j \langle e_j, Ae_j\rangle = \sum_j \langle f_j, Af_j\rangle. \]

Note that since $A$ is non-negative, $\langle e_j, Ae_j\rangle$ and $\langle f_j, Af_j\rangle$ are non-negative real numbers. Thus, the sums are always well defined, but may have the value of $+\infty$.

Definition 19.2 If $A \in \mathcal{B}(\mathbf{H})$ is non-negative and self-adjoint, the value of $\sum_j \langle e_j, Ae_j\rangle$, for any arbitrarily chosen orthonormal basis, is called the trace of $A$. If $\mathrm{trace}(A) < +\infty$, then we say that $A$ is trace class.

For a general $A \in \mathcal{B}(\mathbf{H})$, we say that $A$ is trace class if the non-negative self-adjoint operator $\sqrt{A^*A}$ is trace class.

Note that for any $A \in \mathcal{B}(\mathbf{H})$, $A^*A$ is self-adjoint and non-negative. Thus, the square root of $A^*A$ may be defined by the functional calculus (Definition 7.13 or Proposition 8.4).

Proposition 19.3 

1. If $A \in \mathcal{B}(\mathbf{H})$ is trace class, then for any orthonormal basis $\{e_j\}$, the sum $\sum_j \langle e_j, Ae_j\rangle$ is absolutely convergent. Furthermore, the value of this sum, which we denote as $\mathrm{trace}(A)$, is independent of the choice of orthonormal basis.

2. If $A \in \mathcal{B}(\mathbf{H})$ is trace class, then $A^*$ is also trace class and

\[ \mathrm{trace}(A^*) = \overline{\mathrm{trace}(A)}. \]

3. If $A \in \mathcal{B}(\mathbf{H})$ is trace class, then for all $B \in \mathcal{B}(\mathbf{H})$, the operators $AB$ and $BA$ are also trace class, and

\[ \mathrm{trace}(AB) = \mathrm{trace}(BA). \]

Recall that $A \in \mathcal{B}(\mathbf{H})$ is said to be compact if $A$ maps every bounded
set in $\mathbf{H}$ to a set with compact closure. If a self-adjoint operator $A$ is trace
class, it is necessarily compact and thus has an orthonormal basis $\{e_j\}$ of
eigenvectors, for which the associated eigenvalues $\lambda_j$ are real and tend to
zero as $j$ tends to infinity. (See Theorem VI.16 in Volume I of [34]. One can
deduce the result from, say, the direct integral form of the spectral theorem
for bounded self-adjoint operators by verifying that unless $A$ has point
spectrum with eigenvalues tending to zero, the operator of multiplication
by $\lambda$ in the direct integral will not be compact.) Point 1 of Proposition 19.3
then tells us that $\sum_j |\lambda_j| < \infty$ and that $\operatorname{trace}(A) = \sum_j \lambda_j$. Conversely, if
$A$ is a self-adjoint operator having an orthonormal basis of eigenvectors for
which the associated eigenvalues satisfy $\sum_j |\lambda_j| < \infty$, then $A$ is trace class.

Definition 19.4 An operator $A \in \mathcal{B}(\mathbf{H})$ is said to be Hilbert-Schmidt
if $\operatorname{trace}(A^* A) < \infty$.

Since $A^* A$ is self-adjoint and non-negative, $\operatorname{trace}(A^* A)$ is defined (but
possibly infinite) for any $A \in \mathcal{B}(\mathbf{H})$. If $A$ is trace class, then (by definition)
the trace of $\sqrt{A^* A}$ is finite, in which case the trace of $\sqrt{A^* A}\sqrt{A^* A} = A^* A$
is also finite, by Point 3 of Proposition 19.3. Thus, every trace-class operator is
Hilbert-Schmidt (but not vice versa).

Proposition 19.5 If $A \in \mathcal{B}(\mathbf{H})$ is Hilbert-Schmidt, so is $A^*$. If $A, B \in \mathcal{B}(\mathbf{H})$
are Hilbert-Schmidt, then $AB$ and $BA$ are trace class and $\operatorname{trace}(AB)$
equals $\operatorname{trace}(BA)$.

If $A$ and $B$ are Hilbert-Schmidt operators, the Hilbert-Schmidt inner
product of $A$ and $B$ is $\langle A, B \rangle_{HS} := \operatorname{trace}(A^* B)$ and the Hilbert-Schmidt
norm of $A$ satisfies $\|A\|_{HS}^2 = \langle A, A \rangle_{HS}$. The space of Hilbert-Schmidt
operators is a Hilbert space with respect to $\langle \cdot, \cdot \rangle_{HS}$.
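
To make the remark that a Hilbert-Schmidt operator need not be trace class concrete, here is a short worked example. The operator is an illustrative choice and is not taken from the text: assume $\mathbf{H}$ is separable and infinite dimensional, fix an orthonormal basis $\{e_j\}_{j \ge 1}$, and let $A$ act by $A e_j = e_j / j$.

```latex
% Assumed example: A e_j = (1/j) e_j on an orthonormal basis {e_j}, j = 1, 2, ...
% A is bounded, self-adjoint, and non-negative, so \sqrt{A^* A} = A.
\begin{align*}
\operatorname{trace}(A^* A)
  &= \sum_{j=1}^{\infty} \langle e_j, A^2 e_j \rangle
   = \sum_{j=1}^{\infty} \frac{1}{j^2} = \frac{\pi^2}{6} < \infty, \\
\operatorname{trace}\bigl(\sqrt{A^* A}\bigr)
  &= \sum_{j=1}^{\infty} \langle e_j, A e_j \rangle
   = \sum_{j=1}^{\infty} \frac{1}{j} = +\infty .
\end{align*}
```

Under these assumptions $A$ is Hilbert-Schmidt, with $\|A\|_{HS}^2 = \pi^2/6$, but it is not trace class.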


19.3 Density Matrices: The General Notion 
of the State of a Quantum System 

Typically, we think of the quantum observables (the ones with expectation
values that we wish to take) as being unbounded self-adjoint operators.
But of course we can also take expectation values of bounded self-adjoint
operators, and indeed expectations for bounded operators determine
those for unbounded operators. After all, suppose $A$ is an unbounded
self-adjoint operator and suppose we know the expectation value for $1_E(A)$
for every Borel set $E \subset \mathbb{R}$, where $1_E$ is the indicator function of $E$ and
$1_E(A)$ is defined by the functional calculus (Definition 7.13). The expectation
value for $1_E(A)$ is the probability of obtaining a value in $E$ for a
measurement of the observable $A$. If we know this probability for each $E$,
then we know the full probability distribution of the measurements, and
thus we can compute the expectation value of $A$. Furthermore, we can
always introduce expectation values for (bounded) non-self-adjoint operators.
Each such operator $A$ is of the form $A = A_1 + iA_2$ with $A_1$ and $A_2$
self-adjoint, and so we may reasonably define the expectation value of $A$ to
be the expectation value of $A_1$ plus $i$ times the expectation value of $A_2$.

We then postulate that the general notion of the “state” of a quantum 
system should be simply a “list” of expectation values for all bounded 
operators, satisfying some reasonable hypotheses. 

Definition 19.6 A linear map $\Phi : \mathcal{B}(\mathbf{H}) \to \mathbb{C}$ is a family of expectation
values if the following conditions hold.

1. $\Phi(I) = 1$.

2. $\Phi(A)$ is real whenever $A$ is self-adjoint.

3. $\Phi(A) \ge 0$ whenever $A$ is self-adjoint and non-negative.

4. For any sequence $A_n$ in $\mathcal{B}(\mathbf{H})$, if $\|A_n \psi - A\psi\| \to 0$ for all $\psi \in \mathbf{H}$,
then $\Phi(A_n) \to \Phi(A)$.

Point 4 in the definition says that $\Phi$ is continuous with respect to
strong (sequential) convergence in $\mathcal{B}(\mathbf{H})$. By Exercise 3, any linear map
on $\mathcal{B}(\mathbf{H})$ satisfying Points 1, 2, and 3 is automatically continuous with
respect to the operator norm topology, meaning that if $\|A_n - A\| \to 0$
then $\Phi(A_n) \to \Phi(A)$. However, to establish our characterization of families
of expectation values in terms of density matrices, we need continuity of
$\Phi$ under a more general sort of convergence, where we only assume that
$\|A_n \psi - A\psi\| \to 0$ for each $\psi$. This stronger continuity property does not
follow from Points 1-3. Exercise 5 gives an example of a linear functional
on $\mathcal{B}(\mathbf{H})$ that satisfies Points 1-3 of Definition 19.6, but not Point 4.

Definition 19.7 An operator $\rho \in \mathcal{B}(\mathbf{H})$ is a density matrix if $\rho$ is
self-adjoint and non-negative and $\operatorname{trace}(\rho) = 1$.

Of course, since the trace of a density matrix is assumed to be finite, every
density matrix is trace class. The next two results give a precise
characterization of families of expectation values in terms of density
matrices.

Proposition 19.8 Suppose $\rho$ is a density matrix on $\mathbf{H}$. Then the map
$\Phi_\rho : \mathcal{B}(\mathbf{H}) \to \mathbb{C}$ given by

$$\Phi_\rho(A) = \operatorname{trace}(\rho A) = \operatorname{trace}(A\rho)$$

is a family of expectation values.
Proof. If we define $\Phi_\rho(A) = \operatorname{trace}(\rho A)$, then $\Phi_\rho(I) = \operatorname{trace}(\rho) = 1$. For
any $A \in \mathcal{B}(\mathbf{H})$, we have

$$\operatorname{trace}(\rho A^*) = \operatorname{trace}(A^* \rho) = \operatorname{trace}((\rho A)^*) = \overline{\operatorname{trace}(\rho A)}.$$

It follows that $\operatorname{trace}(\rho A)$ is real when $A$ is self-adjoint. Let $\rho^{1/2}$ be the
non-negative self-adjoint square root of $\rho$. Then $\rho^{1/2}$ and $A\rho^{1/2}$ are
Hilbert-Schmidt (in the latter case, by Point 3 of Proposition 19.3). It follows that
$\operatorname{trace}(A\rho^{1/2}\rho^{1/2}) = \operatorname{trace}(\rho^{1/2} A \rho^{1/2})$, by Proposition 19.5. Thus, if $A$ is
self-adjoint and non-negative,

$$\operatorname{trace}(\rho A) = \operatorname{trace}(\rho^{1/2}\rho^{1/2} A) = \operatorname{trace}(\rho^{1/2} A \rho^{1/2}) \ge 0, \tag{19.1}$$

because $\rho^{1/2} A \rho^{1/2}$ is self-adjoint and non-negative. We have established
that $\Phi_\rho$ satisfies Points 1, 2, and 3 of Definition 19.6.

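Meanwhile, suppose $A_n \psi$ converges in norm to $A\psi$, for each $\psi$ in $\mathbf{H}$.
Then $\|A_n \psi\|$ is bounded as a function of $n$ for each fixed $\psi$. Thus, by the
principle of uniform boundedness (Theorem A.40), there is a constant $C$
such that $\|A_n\| \le C$ for all $n$. Now, if $\{e_j\}$ is an orthonormal basis for $\mathbf{H}$, we have

$$\left| \langle \rho^{1/2} e_j, A_n \rho^{1/2} e_j \rangle \right| \le C \left\| \rho^{1/2} e_j \right\|^2$$

and

$$\sum_j \left\| \rho^{1/2} e_j \right\|^2 = \sum_j \langle e_j, \rho e_j \rangle = \operatorname{trace}(\rho) < \infty.$$

Furthermore, since $A_n(\rho^{1/2} e_j)$ converges to $A(\rho^{1/2} e_j)$ for each $j$, dominated
convergence tells us that

$$\begin{aligned}
\operatorname{trace}(\rho^{1/2} A \rho^{1/2}) &= \sum_j \langle e_j, \rho^{1/2} A \rho^{1/2} e_j \rangle \\
&= \lim_{n \to \infty} \sum_j \langle e_j, \rho^{1/2} A_n \rho^{1/2} e_j \rangle \\
&= \lim_{n \to \infty} \operatorname{trace}(\rho^{1/2} A_n \rho^{1/2}).
\end{aligned}$$

As in (19.1), we can shift the second factor of $\rho^{1/2}$ to the front of the trace
to obtain Point 4 in Definition 19.6. ■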


Theorem 19.9 For any family of expectation values $\Phi : \mathcal{B}(\mathbf{H}) \to \mathbb{C}$, there
is a unique density matrix $\rho$ such that $\Phi(A) = \operatorname{trace}(\rho A)$ for all $A \in \mathcal{B}(\mathbf{H})$.

Proof. Recall from Sect. 3.12 the Dirac notation, in which the expression
$|\phi\rangle\langle\psi|$ denotes the linear operator taking any vector $\chi \in \mathbf{H}$ to the
vector $|\phi\rangle\langle\psi|\chi\rangle$ (in physics notation), that is, the vector $\langle \psi, \chi \rangle \phi$ (in math
notation). If $\rho$ is trace class, then by Exercise 2,

$$\operatorname{trace}(\rho\, |\phi\rangle\langle\psi|) = \langle \psi, \rho\phi \rangle.$$

Thus, if an operator $\rho$ with the desired properties is to exist, we must have

$$\langle \psi, \rho\phi \rangle = \Phi(|\phi\rangle\langle\psi|).$$

Now, by Exercise 3, $\Phi$ satisfies $|\Phi(A)| \le \|A\|$. From this, we can see
that the map

$$(\phi, \psi) \mapsto \Phi(|\phi\rangle\langle\psi|)$$

is a bounded sesquilinear form, so that (by Proposition A.63) there is
a unique bounded operator $\rho$ such that $\Phi(|\phi\rangle\langle\psi|) = \langle \psi, \rho\phi \rangle$ for all $\phi$
and $\psi$. Since $|\phi\rangle\langle\phi|$ is self-adjoint and non-negative, $\langle \phi, \rho\phi \rangle$ is real and
non-negative, which means that $\rho$ is self-adjoint (by Proposition A.63) and
non-negative.

Meanwhile, if $\{e_j\}$ is an orthonormal basis for $\mathbf{H}$, then by Definition 19.2,

$$\begin{aligned}
\operatorname{trace}(\rho) &= \lim_{N \to \infty} \sum_{j=1}^{N} \langle e_j, \rho e_j \rangle \\
&= \lim_{N \to \infty} \Phi\bigl(|e_1\rangle\langle e_1| + \cdots + |e_N\rangle\langle e_N|\bigr) \\
&= \Phi(I) = 1.
\end{aligned}$$

In passing from the second line to the third, we have used Point 4 of
Definition 19.6. Thus, $\rho$ is a density matrix.

We have now found a density matrix $\rho$ such that $\Phi(|\phi\rangle\langle\psi|)$ agrees with
$\operatorname{trace}(\rho\, |\phi\rangle\langle\psi|)$ for all $\phi, \psi \in \mathbf{H}$. By linearity, $\Phi(A) = \operatorname{trace}(\rho A)$ for all
finite-rank operators $A$ (see Exercise 4). Now, if $\{e_j\}$ is an orthonormal basis for
$\mathbf{H}$, let $P_N$ be the orthogonal projection onto the span of $e_1, \ldots, e_N$. Then
for any $A \in \mathcal{B}(\mathbf{H})$, the operator $P_N A$ has finite rank and $P_N A\psi \to A\psi$ for
all $\psi \in \mathbf{H}$. Thus, for all $A \in \mathcal{B}(\mathbf{H})$,

$$\Phi(A) = \lim_{N \to \infty} \Phi(P_N A) = \lim_{N \to \infty} \operatorname{trace}(\rho P_N A) = \operatorname{trace}(\rho A),$$

by Proposition 19.8. ■

Our next result shows that our new notion of the state of a system
includes our old notion.

Proposition 19.10 For any unit vector $\psi \in \mathbf{H}$, let $|\psi\rangle\langle\psi|$, in accordance
with Notation 3.29, denote the orthogonal projection onto the span of $\psi$.
Then $|\psi\rangle\langle\psi|$ is a density matrix and for all $A \in \mathcal{B}(\mathbf{H})$, we have

$$\operatorname{trace}(|\psi\rangle\langle\psi| A) = \langle \psi, A\psi \rangle.$$

Note that if $\psi_2 = e^{i\theta}\psi_1$, then $|\psi_1\rangle\langle\psi_1| = |\psi_2\rangle\langle\psi_2|$. Thus, from our new
point of view, we may say that the reason $\psi_1$ and $\psi_2$ represent the same
“physical state” is that they determine the same density matrix.

Proof. Since it is an orthogonal projection, $|\psi\rangle\langle\psi|$ is bounded, self-adjoint,
and non-negative. To compute its trace, we choose an orthonormal basis
$\{e_j\}$ for $\mathbf{H}$ with $e_1 = \psi$, which gives $\operatorname{trace}(|\psi\rangle\langle\psi|) = 1$. Using the same
orthonormal basis, we compute that, for any $A \in \mathcal{B}(\mathbf{H})$,

$$\operatorname{trace}(|\psi\rangle\langle\psi| A) = \sum_j \langle e_j, \psi \rangle \langle \psi, A e_j \rangle = \langle \psi, A\psi \rangle,$$

as desired. ■
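
Proposition 19.10 and the remark on phases can be checked numerically in a finite-dimensional model. The following is a minimal sketch in plain Python, assuming $\mathbf{H} = \mathbb{C}^2$; the vector `psi`, the matrix `A`, and all helper names are hypothetical choices made for illustration only.

```python
import cmath

def outer(phi, psi):
    """Matrix of |phi><psi|, the operator sending chi to <psi, chi> phi."""
    return [[a * b.conjugate() for b in psi] for a in phi]

def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def inner(phi, psi):
    """<phi, psi>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def apply(A, v):
    return [sum(A[i][k] * v[k] for k in range(len(v))) for i in range(len(A))]

# Hypothetical unit vector in C^2 and self-adjoint matrix A.
psi = [0.6, 0.8j]
A = [[1.0, 2 - 1j], [2 + 1j, -3.0]]

rho = outer(psi, psi)               # the density matrix |psi><psi|
lhs = trace(matmul(rho, A))         # trace(|psi><psi| A)
rhs = inner(psi, apply(A, psi))     # <psi, A psi>

# Multiplying psi by a phase e^{i theta} leaves the density matrix unchanged.
phase = cmath.exp(0.7j)
psi2 = [phase * c for c in psi]
rho2 = outer(psi2, psi2)
```

Here `lhs` and `rhs` agree up to rounding, and `rho2` coincides with `rho` entry by entry, which is the finite-dimensional shadow of the statement that $\psi$ and $e^{i\theta}\psi$ determine the same density matrix.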

Definition 19.11 A density matrix $\rho \in \mathcal{B}(\mathbf{H})$ is a pure state if there
exists a unit vector $\psi \in \mathbf{H}$ such that $\rho$ is equal to the orthogonal projection
onto the span of $\psi$. The density matrix $\rho$ is called a mixed state if no
such unit vector $\psi$ exists.

An isolated system that is in a pure state initially will remain in a pure
state for all later times, since the initial state $\psi_0$ evolves to the pure state
$e^{-it\hat{H}/\hbar}\psi_0$, where $\hat{H}$ is the Hamiltonian for the system. But if a system is
interacting with its environment, then as discussed in Sect. 19.5, the system
may move into a mixed state at a later time.

There are several different ways of characterizing the pure states as a
subset of the density matrices. First, it is not hard to see (Exercise 6) that
a density matrix $\rho$ is a pure state if and only if $\operatorname{trace}(\rho^2) = 1$. Second, the
set of density matrices is a convex set, since if $\rho_1$ and $\rho_2$ are non-negative
and have trace 1, then so is $\lambda\rho_1 + (1 - \lambda)\rho_2$, for $0 < \lambda < 1$. According to
Exercise 7, the pure states are precisely the extreme points of this set. That
is, a density matrix $\rho$ is a pure state if and only if it cannot be expressed
as $\rho = \lambda\rho_1 + (1 - \lambda)\rho_2$ where $\rho_1$ and $\rho_2$ are distinct density matrices and
$\lambda$ belongs to $(0, 1)$. Third, we may define the von Neumann entropy $S(\rho)$
of a density matrix $\rho$ by

$$S(\rho) = \operatorname{trace}(-\rho \log \rho),$$

where $\rho \log \rho$ is defined by the functional calculus. (Since $\lim_{\lambda \to 0^+} \lambda \log \lambda = 0$,
we interpret $0 \log 0$ as being 0.) Since the eigenvalues of $\rho$ are all
between 0 and 1, we see that $-\rho \log \rho$ is a non-negative self-adjoint operator,
which has a well-defined trace, which may have the value $+\infty$. According
to Exercise 8, a density matrix $\rho$ is a pure state if and only if $S(\rho) = 0$.
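
The first and third characterizations can be illustrated for a density matrix that has already been diagonalized. The sketch below is a hedged illustration rather than part of the text: it assumes the eigenvalues of $\rho$ are known, so that $\operatorname{trace}(\rho^2) = \sum_j \lambda_j^2$ and $S(\rho) = -\sum_j \lambda_j \log \lambda_j$, and the function names are hypothetical.

```python
import math

def purity(eigenvalues):
    """trace(rho^2) for a density matrix with the given eigenvalues."""
    return sum(lam * lam for lam in eigenvalues)

def von_neumann_entropy(eigenvalues):
    """S(rho) = -sum_j lam_j log lam_j, with 0 log 0 interpreted as 0."""
    return -sum(lam * math.log(lam) for lam in eigenvalues if lam > 0)

pure = [1.0, 0.0]    # eigenvalues of a rank-one projection on C^2
mixed = [0.5, 0.5]   # eigenvalues of I/2 on C^2

purity_pure, entropy_pure = purity(pure), von_neumann_entropy(pure)
purity_mixed, entropy_mixed = purity(mixed), von_neumann_entropy(mixed)
```

For the rank-one projection the purity is 1 and the entropy is 0, while for $I/2$ the purity drops to $1/2$ and the entropy is $\log 2$.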

Suppose that we have two pure states, coming from unit vectors $\psi_1$ and
$\psi_2$. Then there are two different senses in which we can take a superposition,
that is, linear combination, of the corresponding quantum states. If we use
our old point of view, in which the states are vectors in $\mathbf{H}$, then we may take
the linear combination $c_1\psi_1 + c_2\psi_2$, and then normalize this vector to be a
unit vector. If we use our new point of view, in which the states are density
matrices, then we may take the linear combination $c_1 |\psi_1\rangle\langle\psi_1| + c_2 |\psi_2\rangle\langle\psi_2|$,
where in this case $c_1$ and $c_2$ should be non-negative and should add to 1.
These two notions of superposition are different, since

$$C\, |c_1\psi_1 + c_2\psi_2\rangle\langle c_1\psi_1 + c_2\psi_2| \ne c_1 |\psi_1\rangle\langle\psi_1| + c_2 |\psi_2\rangle\langle\psi_2|, \tag{19.2}$$

no matter how the constant $C$ is chosen. After all, the state on the
left-hand side of (19.2) is a pure state, whereas (unless $\psi_2$ is a multiple of $\psi_1$)
the state on the right-hand side of (19.2) is a mixed state, since the range
of this operator is 2-dimensional rather than 1-dimensional.

Physicists call the first sort of superposition (in which we take a linear
combination of vectors in $\mathbf{H}$) coherent superposition or quantum
superposition, and they call the second sort of superposition (in which we take a
linear combination of the associated density matrices) incoherent
superposition. The reason for the term “coherent” is that coherent superposition
depends on the phases of the coefficients. That is, if $\psi_1$ and $\psi_2$ are linearly
independent, the vector $c_1 e^{i\theta}\psi_1 + c_2 e^{i\phi}\psi_2$ does not represent the same
quantum state as $c_1\psi_1 + c_2\psi_2$, unless $e^{i\theta} = e^{i\phi}$. By contrast, the density
matrix associated with $e^{i\theta}\psi$ is the same as the density matrix associated
with $\psi$, and so the phases have no effect when taking linear combinations
of the density matrices associated to vectors in $\mathbf{H}$. When taking a coherent
superposition, there is no simple relationship between the expectation
value of an observable in the states $\psi_1$ and $\psi_2$ and the expectation value
of the same observable in the state $c_1\psi_1 + c_2\psi_2$. On the other hand, when
taking an incoherent superposition, expectation values in the new state are
just linear combinations of the original expectation values:

$$\operatorname{trace}\bigl((c_1 |\psi_1\rangle\langle\psi_1| + c_2 |\psi_2\rangle\langle\psi_2|) A\bigr) = c_1 \langle \psi_1, A\psi_1 \rangle + c_2 \langle \psi_2, A\psi_2 \rangle.$$
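
The contrast between the two kinds of superposition shows up already in $\mathbb{C}^2$. The sketch below is an illustration under assumed data (the standard basis vectors for $\psi_1, \psi_2$, equal weights, and the observable $A$ that swaps them); none of these choices comes from the text, and the helper names are hypothetical.

```python
import math

def expect_vector(psi, A):
    """<psi, A psi> for a unit vector psi (conjugate-linear in the first slot)."""
    A_psi = [sum(A[i][k] * psi[k] for k in range(len(psi))) for i in range(len(A))]
    return sum(complex(a).conjugate() * b for a, b in zip(psi, A_psi))

def expect_density(rho, A):
    """trace(rho A) for square matrices stored as nested lists."""
    n = len(rho)
    return sum(rho[i][k] * A[k][i] for i in range(n) for k in range(n))

A = [[0, 1], [1, 0]]              # assumed observable: swaps psi1 and psi2
psi1, psi2 = [1, 0], [0, 1]
s = 1 / math.sqrt(2)

# Coherent superpositions: the relative phase of the coefficients matters.
coherent_plus = expect_vector([s, s], A)     # (psi1 + psi2)/sqrt(2)
coherent_minus = expect_vector([s, -s], A)   # (psi1 - psi2)/sqrt(2)

# Incoherent superposition (1/2)|psi1><psi1| + (1/2)|psi2><psi2| = I/2.
rho_mix = [[0.5, 0], [0, 0.5]]
incoherent = expect_density(rho_mix, A)
average = 0.5 * expect_vector(psi1, A) + 0.5 * expect_vector(psi2, A)
```

The two coherent superpositions give expectation values close to $+1$ and $-1$, even though $\langle \psi_1, A\psi_1 \rangle = \langle \psi_2, A\psi_2 \rangle = 0$, while the incoherent superposition gives exactly the weighted average of those two values, namely 0.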

19.4 Modified Axioms for Quantum Mechanics 

We may now modify the axioms of quantum mechanics introduced in
Sect. 3.6 to incorporate density matrices, beginning with our revised
notion of a state.

Axiom 6 The state of a quantum system is described by a density matrix $\rho$
on an appropriate Hilbert space $\mathbf{H}$. If $A$ is any bounded operator on $\mathbf{H}$, the
expectation value of $A$ in the state $\rho$ is given by the quantity $\operatorname{trace}(\rho A) =
\operatorname{trace}(A\rho)$.

In Axiom 6, we assume that $A$ is bounded, so that $\operatorname{trace}(\rho A)$ and $\operatorname{trace}(A\rho)$
are defined and equal by Proposition 19.3. If $A$ is unbounded and
self-adjoint, we can construct a probability measure $\mu^A_\rho$ describing the
probabilities for measurements of $A$ in the state $\rho$, by the formula

$$\mu^A_\rho(E) = \operatorname{trace}(\rho\, 1_E(A)),$$

where $1_E(A)$ is defined by the functional calculus.

We then define the expectation value of $A$ in the state $\rho$ as $\int_{\mathbb{R}} \lambda \, d\mu^A_\rho(\lambda)$,
provided the integral is absolutely convergent. If the integral is absolutely
convergent, it is reasonable to hope that both $\rho A$ and $A\rho$ will be densely
defined and bounded, that (the bounded extension to $\mathbf{H}$ of) these operators
will be trace class, and that both $\operatorname{trace}(\rho A)$ and $\operatorname{trace}(A\rho)$ will coincide with
$\int_{\mathbb{R}} \lambda \, d\mu^A_\rho(\lambda)$. We will not, however, enter into an investigation of this issue.

Next, we propose a variant of Axiom 4, describing the “collapse of the
wave function.”

Axiom 7 Suppose a quantum system is initially in a state $\rho$ and a
measurement of a self-adjoint operator $A$ with point spectrum is performed. If
the measurement results in the value $\lambda$ for $A$, then immediately after the
measurement, the system will be in the state $\rho'$, where

$$\rho' = \frac{1}{Z} P_\lambda \rho P_\lambda.$$

Here $P_\lambda$ is the orthogonal projection onto the $\lambda$-eigenspace of $A$ and
$Z = \operatorname{trace}(P_\lambda \rho P_\lambda)$.

Note that if $\rho$ is non-negative, self-adjoint, and trace class, then $P_\lambda \rho P_\lambda$
is also non-negative, self-adjoint, and trace class. Implicit in Axiom 7 is
the assumption that the measurement can only result in values $\lambda$ for which
$P_\lambda \rho P_\lambda$ is nonzero. In particular, $\lambda$ must be an eigenvalue for $A$.
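
Axiom 7 is easy to try out on a small matrix model. The sketch below assumes $\mathbf{H} = \mathbb{C}^3$ and a hypothetical observable $A = \operatorname{diag}(1, 1, 2)$, chosen so that the eigenvalue $\lambda = 1$ is degenerate; the function `collapse` and the sample state are illustrative, not taken from the text.

```python
def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def collapse(rho, P):
    """Return (Z, rho') with Z = trace(P rho P) and rho' = P rho P / Z."""
    P_rho_P = matmul(matmul(P, rho), P)
    Z = trace(P_rho_P)
    if Z == 0:
        # Axiom 7 only covers outcomes for which P rho P is nonzero.
        raise ValueError("this outcome cannot occur in the state rho")
    return Z, [[entry / Z for entry in row] for row in P_rho_P]

# Assumed example: A = diag(1, 1, 2), measured value lambda = 1.
P_1 = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]   # projection onto the 1-eigenspace
rho = [[1 / 3] * 3 for _ in range(3)]     # pure state with psi = (1, 1, 1)/sqrt(3)

Z, rho_after = collapse(rho, P_1)
```

In this example $Z = 2/3$, and the post-measurement state is the projection onto $(1, 1, 0)/\sqrt{2}$, which again has trace 1. By the formula for $\mu^A_\rho$ above and cyclicity of the trace, $Z = \operatorname{trace}(\rho P_\lambda)$ is also the probability of obtaining the value $\lambda$.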

Finally, we introduce the notion of time-evolution for our new notion of
“state.”

Axiom 8 The time evolution of the state of the system is described by the
following equation for a time-dependent density matrix $\rho(t)$:

$$\frac{d\rho}{dt} = -\frac{1}{i\hbar} [\rho(t), \hat{H}]. \tag{19.3}$$

This equation may be solved, formally, by setting

$$\rho(t) = e^{-it\hat{H}/\hbar} \rho_0 \, e^{it\hat{H}/\hbar}, \tag{19.4}$$

where $\rho_0$ is the state of the system at time $t = 0$.

There are some domain issues involved in the interpretation of the
equation (19.3). Rather than entering into an examination of those issues here,
we will simply take (19.4) as the definition of the time-evolution of a
density matrix. Presumably, if $\rho_0$ is nice enough, then the map $t \mapsto \rho(t)$ will be
differentiable as a curve in the Banach space $\mathcal{B}(\mathbf{H})$ and its derivative will
be (an extension of) the operator on the right-hand side of (19.3). By
comparison, it follows from Stone’s theorem and Lemma 10.17 that the family
of pure states $\psi(t) := e^{-it\hat{H}/\hbar}\psi_0$ satisfies the Schrödinger equation in the
natural Hilbert space sense if and only if $\psi_0$ belongs to the domain of $\hat{H}$.
To see that the time-evolution in (19.4) is consistent with the previously
defined time-evolution of pure states, observe that

$$e^{-it\hat{H}/\hbar} |\psi_0\rangle\langle\psi_0| \, e^{it\hat{H}/\hbar} = |e^{-it\hat{H}/\hbar}\psi_0\rangle\langle e^{-it\hat{H}/\hbar}\psi_0| = |\psi(t)\rangle\langle\psi(t)|,$$

since $(e^{it\hat{H}/\hbar})^* = e^{-it\hat{H}/\hbar}$.
It should be noted that (19.3) differs by a minus sign from the
time-evolution in the Heisenberg picture of quantum mechanics (Definition 3.20).
Although this difference may seem strange, keep in mind that in Axiom 8,
we are not adopting the Heisenberg point of view, in which the states
are independent of time and the observables evolve in time. Rather, we
are adopting a modified version of the Schrödinger picture, in which it
is the states that evolve in time, but where the states are now certain
sorts of operators. Even though both the states and the observables are
now operators, the observables (in the Heisenberg picture) and the states
(in the Schrödinger picture) must evolve in opposite directions in time, in
order for the expectation values of the observables to be the same in the
two pictures.
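
The agreement of expectation values in the two pictures can be checked in one line for a bounded observable $A$. The derivation below is a sketch under an assumption: it takes the Heisenberg-picture observable of Definition 3.20 to be $A(t) = e^{it\hat{H}/\hbar} A \, e^{-it\hat{H}/\hbar}$ (that definition is not reproduced in this section), and it uses (19.4) together with Point 3 of Proposition 19.3.

```latex
% Assumption: A(t) = e^{it\hat{H}/\hbar} A e^{-it\hat{H}/\hbar} (Heisenberg picture).
\begin{align*}
\operatorname{trace}(\rho(t) A)
  &= \operatorname{trace}\bigl(e^{-it\hat{H}/\hbar} \rho_0 \, e^{it\hat{H}/\hbar} A\bigr) \\
  &= \operatorname{trace}\bigl(\rho_0 \, e^{it\hat{H}/\hbar} A \, e^{-it\hat{H}/\hbar}\bigr)
     && \text{(Proposition 19.3, Point 3)} \\
  &= \operatorname{trace}(\rho_0 A(t)).
\end{align*}
```

The unitary factors that conjugate $\rho_0$ in one order conjugate $A$ in the opposite order, which is the source of the relative minus sign between the two evolution equations.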


19.5 Composite Systems and the Tensor Product 

As discussed in Sect. 3.11, the Hilbert space for two (nonidentical, spinless)
particles moving in $\mathbb{R}^3$ is $L^2(\mathbb{R}^6)$. Given a unit vector (i.e., a pure state)
$\psi$ in $L^2(\mathbb{R}^6)$, the quantity $|\psi(\mathbf{x}^1, \mathbf{x}^2)|^2$ represents the joint probability
distribution for the position $\mathbf{x}^1$ of the first particle and the position $\mathbf{x}^2$ of
the second particle. The following result shows that $L^2(\mathbb{R}^6)$ is naturally
isomorphic to the Hilbert tensor product of two copies of the Hilbert space
for the individual particles, namely $L^2(\mathbb{R}^3)$.

Proposition 19.12 Suppose that $(X_1, \mu_1)$ and $(X_2, \mu_2)$ are $\sigma$-finite
measure spaces. Then there is a unique unitary map

$$\rho : L^2(X_1, \mu_1) \mathbin{\hat{\otimes}} L^2(X_2, \mu_2) \to L^2(X_1 \times X_2, \mu_1 \times \mu_2)$$

such that

$$\rho(\phi \otimes \psi)(x, y) = \phi(x)\psi(y)$$

for all $\phi \in L^2(X_1, \mu_1)$ and $\psi \in L^2(X_2, \mu_2)$.

Here $\hat{\otimes}$ denotes the Hilbert tensor product defined in Appendix A.4.5.
Proof. For simplicity of notation, we suppress the dependence of $L^2$ spaces
on the measure, writing, say, $L^2(X_1)$ rather than $L^2(X_1, \mu_1)$. Consider first
the algebraic (i.e., uncompleted) tensor product $L^2(X_1) \otimes L^2(X_2)$. Using the
universal property of tensor products, we can construct a linear map $\rho$ of
$L^2(X_1) \otimes L^2(X_2) \to L^2(X_1 \times X_2)$ determined uniquely by the requirement
that

$$\rho(\phi \otimes \psi)(x, y) = \phi(x)\psi(y).$$

Now, every element of the algebraic tensor product $L^2(X_1) \otimes L^2(X_2)$ can
be expressed as a linear combination of elements of the form $\phi_j \otimes \psi_j$, with
$\phi_j \in L^2(X_1)$ and $\psi_j \in L^2(X_2)$. By computing on such linear combinations,
we can easily verify that $\rho$ is isometric. Thus, by the bounded linear
transformation (BLT) theorem (Theorem A.36), $\rho$ has a unique isometric
extension to a map of the completed tensor product $L^2(X_1) \mathbin{\hat{\otimes}} L^2(X_2)$ into
$L^2(X_1 \times X_2)$.

It remains only to show that $\rho$ is surjective. Since both measures are
$\sigma$-finite, it is a simple exercise to reduce the problem to the case where $\mu_1$
and $\mu_2$ are finite, which we henceforth assume. Suppose $\psi \in L^2(X_1 \times X_2)$
is orthogonal to the image of $\rho$. Then $\psi$ is orthogonal to the indicator
function of every measurable rectangle, and hence to the indicator function
of any finite disjoint union of measurable rectangles. The collection $\mathcal{A}$ of
such disjoint unions is an algebra of sets. Let $\mathcal{M}$ denote the collection of
measurable subsets $E$ of $X_1 \times X_2$ such that the integral of $\psi$ over $E$ is zero.
Then $\mathcal{M}$ is a monotone class containing $\mathcal{A}$. By the monotone class lemma
(Theorem A.8), $\mathcal{M}$ contains the $\sigma$-algebra generated by $\mathcal{A}$, which is the
$\sigma$-algebra on which $\mu_1 \times \mu_2$ is defined. Thus, the integral of $\psi$ over every
measurable set is zero, which implies that $\psi$ is zero almost everywhere. ■

The preceding example suggests the following general principle.

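Axiom 9 The Hilbert space for a composite system made up of two
subsystems is the Hilbert tensor product $\mathbf{H}_1 \mathbin{\hat{\otimes}} \mathbf{H}_2$ of the Hilbert spaces $\mathbf{H}_1$ and
$\mathbf{H}_2$ describing the subsystems.

If $A$ and $B$ are bounded operators on $\mathbf{H}_1$ and $\mathbf{H}_2$, respectively, then there
is a unique bounded operator $A \otimes B$ on $\mathbf{H}_1 \mathbin{\hat{\otimes}} \mathbf{H}_2$ such that

$$(A \otimes B)(\phi \otimes \psi) = (A\phi) \otimes (B\psi)$$

for all $\phi \in \mathbf{H}_1$ and $\psi \in \mathbf{H}_2$. (See Appendix A.4.5.)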

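Theorem 19.13 Suppose that $\rho$ is a density matrix on $\mathbf{H}_1 \mathbin{\hat{\otimes}} \mathbf{H}_2$. Then
there exists a unique density matrix $\rho^{(1)}$ on $\mathbf{H}_1$ with the property that

$$\operatorname{trace}(\rho^{(1)} A) = \operatorname{trace}(\rho(A \otimes I)) \tag{19.5}$$

for all $A \in \mathcal{B}(\mathbf{H}_1)$. We call $\rho^{(1)}$ the partial trace of $\rho$ with respect to $\mathbf{H}_2$. If
$\{f_k\}$ is an orthonormal basis for $\mathbf{H}_2$, then the operator $\rho^{(1)}$ satisfies

$$\langle \phi, \rho^{(1)} \psi \rangle = \sum_k \langle \phi \otimes f_k, \rho(\psi \otimes f_k) \rangle \tag{19.6}$$

for all $\phi, \psi \in \mathbf{H}_1$. Similarly, there is a unique density matrix $\rho^{(2)}$ on $\mathbf{H}_2$
satisfying $\operatorname{trace}(\rho^{(2)} B) = \operatorname{trace}(\rho(I \otimes B))$ for all $B \in \mathcal{B}(\mathbf{H}_2)$. If $\{e_j\}$ is an
orthonormal basis for $\mathbf{H}_1$, then $\rho^{(2)}$ satisfies

$$\langle \phi, \rho^{(2)} \psi \rangle = \sum_j \langle e_j \otimes \phi, \rho(e_j \otimes \psi) \rangle \tag{19.7}$$

for all $\phi, \psi \in \mathbf{H}_2$.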
The motivation for the terminology “partial trace” is provided by (19.6)
and (19.7), which are similar to the formula for the trace of an operator,
except that the sums range only over a basis for one of the two Hilbert
spaces. One special case of Theorem 19.13 is the one in which the density
matrix $\rho$ is of the form $\rho = \rho_1 \otimes \rho_2$, where $\rho_1$ and $\rho_2$ are density matrices on
$\mathbf{H}_1$ and $\mathbf{H}_2$, respectively. (Any operator $\rho$ of this form is a density matrix
on $\mathbf{H}_1 \mathbin{\hat{\otimes}} \mathbf{H}_2$.) In that case, it is not hard to see that $\rho^{(1)} = \rho_1$ and $\rho^{(2)} = \rho_2$.
We may describe this case by saying that the state of the first system is
“independent” of the state of the second system.
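
Formula (19.6) translates directly into code in finite dimensions. The sketch below assumes $\mathbf{H}_1 = \mathbf{H}_2 = \mathbb{C}^2$ and stores an operator on the tensor product as a $4 \times 4$ nested list in which the basis vector $e_j \otimes f_k$ has index $2j + k$; this indexing convention, the helper names, and the two sample states are assumptions made for illustration.

```python
import math

def kron(X, Y):
    """Matrix of X (x) Y, with e_j (x) f_k stored at index j * len(Y) + k."""
    n, m = len(X), len(Y)
    return [[X[i // m][j // m] * Y[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def partial_trace_second(rho, n, m):
    """rho^(1) on C^n for rho on C^n (x) C^m, summing over k as in (19.6)."""
    return [[sum(rho[j * m + k][jp * m + k] for k in range(m)) for jp in range(n)]
            for j in range(n)]

def purity(rho):
    """trace(rho^2)."""
    n = len(rho)
    return sum(rho[i][k] * rho[k][i] for i in range(n) for k in range(n))

# Product state rho_1 (x) rho_2: the partial trace returns rho_1.
rho_1 = [[0.25, 0.0], [0.0, 0.75]]
rho_2 = [[0.5, 0.5], [0.5, 0.5]]
reduced_product = partial_trace_second(kron(rho_1, rho_2), 2, 2)

# Pure state of the composite system that is not a product:
# psi = (e_0 (x) f_0 + e_1 (x) f_1) / sqrt(2).
s = 1 / math.sqrt(2)
psi = [s, 0.0, 0.0, s]
rho_entangled = [[a * b for b in psi] for a in psi]   # |psi><psi| (real entries)
reduced_entangled = partial_trace_second(rho_entangled, 2, 2)
```

For the product state the reduced density matrix is `rho_1`, as stated above. For the second example the composite state is pure, yet the reduced state is $I/2$, with $\operatorname{trace}((\rho^{(1)})^2) = 1/2$, so the subsystem is in a mixed state.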

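Lemma 19.14 For any sequence $A_n \in \mathcal{B}(\mathbf{H}_1)$, if $\|A_n \psi - A\psi\| \to 0$ for
some $A \in \mathcal{B}(\mathbf{H}_1)$ and all $\psi \in \mathbf{H}_1$, then

$$\|(A_n \otimes I)\phi - (A \otimes I)\phi\| \to 0$$

for all $\phi \in \mathbf{H}_1 \mathbin{\hat{\otimes}} \mathbf{H}_2$. A similar result holds for operators of the form $I \otimes B_n$.

Proof. See Exercise 9. ■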

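Proof of Theorem 19.13. The existence and uniqueness of $\rho^{(1)}$ and $\rho^{(2)}$
follow from Lemma 19.14 and Theorem 19.9. Meanwhile, if $\{e_j\}$ is an
orthonormal basis for $\mathbf{H}_1$ and $\{f_k\}$ is an orthonormal basis for $\mathbf{H}_2$, we
have

$$\begin{aligned}
\langle \phi, \rho^{(1)} \psi \rangle &= \operatorname{trace}(\rho^{(1)} |\psi\rangle\langle\phi|) \\
&= \sum_{j,k} \langle e_j \otimes f_k, \rho(|\psi\rangle\langle\phi| \otimes I)(e_j \otimes f_k) \rangle \\
&= \sum_{j,k} \langle e_j \otimes f_k, \langle \phi, e_j \rangle \, \rho(\psi \otimes f_k) \rangle \\
&= \sum_k \Bigl\langle \Bigl( \sum_j \langle e_j, \phi \rangle e_j \Bigr) \otimes f_k, \rho(\psi \otimes f_k) \Bigr\rangle \\
&= \sum_k \langle \phi \otimes f_k, \rho(\psi \otimes f_k) \rangle.
\end{aligned}$$

This is the desired formula for $\langle \phi, \rho^{(1)} \psi \rangle$. Note that because $\rho$ is trace class
and $|\psi\rangle\langle\phi| \otimes I$ is bounded, $\rho(|\psi\rangle\langle\phi| \otimes I)$ is trace class, in which case the sum
in the second line is absolutely convergent, by Proposition 19.3. Thus, we
are allowed to rearrange the sum freely. ■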

Suppose we have two quantum systems with Hilbert spaces $\mathbf{H}_1$ and $\mathbf{H}_2$
and Hamiltonians $\hat{H}_1$ and $\hat{H}_2$. If the two systems do not interact with each
other and the composite system is initially in a (pure) state of the form
$\phi_0 \otimes \psi_0$, then we expect that at some later time, the composite system will
be in the state $\phi(t) \otimes \psi(t)$, where $\phi(t) = e^{-it\hat{H}_1/\hbar}\phi_0$ and $\psi(t) = e^{-it\hat{H}_2/\hbar}\psi_0$.
Ignoring domain considerations, we may compute that

$$\begin{aligned}
i\hbar \frac{d}{dt} [\phi(t) \otimes \psi(t)] &= (\hat{H}_1 \phi(t)) \otimes \psi(t) + \phi(t) \otimes (\hat{H}_2 \psi(t)) \\
&= (\hat{H}_1 \otimes I + I \otimes \hat{H}_2)(\phi(t) \otimes \psi(t)).
\end{aligned}$$

This calculation suggests that the correct Hamiltonian for a noninteracting
composite system is the operator $\hat{H}_1 \otimes I + I \otimes \hat{H}_2$.

It is not, however, obvious how to select a domain for $\hat{H}_1 \otimes I + I \otimes \hat{H}_2$
in such a way that this operator will be self-adjoint. (The reader is invited
to try to choose such a domain “by hand.”) The easiest way to deal with
this issue is to use Stone’s theorem, as in the following definition.

Definition 19.15 If $A$ and $B$ are self-adjoint operators on $\mathbf{H}_1$ and $\mathbf{H}_2$,
define the operator $A \otimes I + I \otimes B$ to be the infinitesimal generator of the strongly
continuous one-parameter unitary group $e^{itA} \otimes e^{itB}$. Thus, by Stone’s
theorem, $A \otimes I + I \otimes B$ is self-adjoint.

It is not hard to check that $e^{itA} \otimes e^{itB}$ is indeed strongly continuous. In
the case $B = 0$, the operator $A \otimes I$ is defined as the infinitesimal generator
of $e^{itA} \otimes I$. If $A$ and $B$ happen to be bounded, then $A \otimes I + I \otimes B$ defined by
Definition 19.15 coincides with $A \otimes I + I \otimes B$ defined as the sum of tensor
products of bounded operators, as in Sect. A.4.5.

Axiom 10 Suppose $\mathbf{H}_1$ and $\mathbf{H}_2$ are the Hilbert spaces for two quantum
systems, with Hamiltonians $\hat{H}_1$ and $\hat{H}_2$, respectively. Then the Hamiltonian
for the noninteracting composite system is $\hat{H}_1 \otimes I + I \otimes \hat{H}_2$, where the domain
of $\hat{H}_1 \otimes I + I \otimes \hat{H}_2$ is as in Definition 19.15.

A physicist would write $\hat{H}_1 \otimes I + I \otimes \hat{H}_2$ simply as $\hat{H}_1 + \hat{H}_2$, with the
understanding that $\hat{H}_1$ acts only on the first factor in the tensor product
and $\hat{H}_2$ acts only on the second factor.

In general, the two components of a composite system will interact, in
which case the Hamiltonian for the composite system is typically of the
form

$$\hat{H} = \hat{H}_1 \otimes I + I \otimes \hat{H}_2 + \hat{H}_{\mathrm{int}},$$

where $\hat{H}_{\mathrm{int}}$ is an “interaction term.” Often, the interaction term may be
considered “small” compared with the other terms in the Hamiltonian.
Consider, for example, a system consisting of particles in a box, with a
barrier dividing the box in half. Suppose the particles interact by means of
a two-particle potential of the form $\sum_{j<k} V(\mathbf{x}^j - \mathbf{x}^k)$ (Sect. 2.3.2) and that
$V(\mathbf{x}^j - \mathbf{x}^k)$ is very small unless the two particles are close together. There
will typically be far more pairs of nearby particles in which the two particles
are on the same side of the box than nearby pairs on opposite sides. Thus,
even though the interaction between the two systems may substantially
affect the behavior of the composite system over long periods of time, it is
still reasonable to think of $\hat{H}_1 \otimes I$ as “the energy of the first subsystem”
and $I \otimes \hat{H}_2$ as “the energy of the second subsystem.”

Suppose we start out in a state $\rho$ of the composite system for which
the state $\rho^{(1)}$ of the first subsystem is a pure state. If the system is an
interacting one, the first subsystem will probably not remain in a pure
state at later times. Indeed, suppose that the second subsystem is a very
large system having temperature $T$. Then, according to the postulates of
quantum statistical mechanics, we are supposed to believe that once the two
systems have reached thermal equilibrium, the state of the first subsystem
will be given by the following highly mixed state:

$$\rho^{(1)} = \frac{1}{Z(\beta)} e^{-\beta \hat{H}_1}. \tag{19.8}$$

Here $\beta = 1/(k_B T)$, where $k_B$ is Boltzmann’s constant, and $Z(\beta)$ is a
normalization constant, known as the partition function of the theory, given
by $Z(\beta) = \operatorname{trace}(e^{-\beta \hat{H}_1})$.

Of course, for this idea to make sense, $e^{-\beta \hat{H}_1}$ must be trace class. This
will be the case provided that $\hat{H}_1$ has discrete spectrum with eigenvalues
tending to $+\infty$ at some reasonable rate. Thus, in quantum statistical
mechanics, the expectation value of some observable $A$ for the first subsystem
will be (once equilibrium is reached)

$$\langle A \rangle = \frac{1}{Z} \operatorname{trace}(e^{-\beta \hat{H}_1} A). \tag{19.9}$$

In particular, when $A = \hat{H}_1$, (19.9) provides a natural generalization of
Planck’s model of blackbody radiation; compare Exercise 2 in Chap. 1.
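
When $\hat{H}_1$ has discrete spectrum and $A$ is diagonal in the same eigenbasis, (19.9) reduces to a Boltzmann-weighted average over eigenvalues. The sketch below is an illustration under stated assumptions: energy levels $E_n = n\hbar\omega$ (a harmonic oscillator with the zero-point energy dropped), units in which $\hbar\omega = 1$, and a truncation of the spectrum at 200 levels. The function name and parameter values are hypothetical.

```python
import math

def thermal_expectation(energies, values, beta):
    """(1/Z) * sum_n exp(-beta E_n) a_n, the diagonal case of (19.9)."""
    weights = [math.exp(-beta * E) for E in energies]
    Z = sum(weights)                      # partition function Z(beta)
    return sum(w * a for w, a in zip(weights, values)) / Z

hbar_omega = 1.0   # assumed energy unit
beta = 2.0         # assumed inverse temperature 1/(k_B T)

# Truncated spectrum E_n = n * hbar_omega; the neglected tail is of order exp(-400).
energies = [n * hbar_omega for n in range(200)]

mean_energy = thermal_expectation(energies, energies, beta)   # case A = H_1
planck = hbar_omega / (math.exp(beta * hbar_omega) - 1)       # closed form
```

With these assumptions the thermal average of the energy agrees, up to rounding, with the closed form $\hbar\omega / (e^{\beta\hbar\omega} - 1)$ obtained by summing the geometric series, which is the average energy per mode appearing in Planck's blackbody law.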


19.6 Multiple Particles: Bosons and Fermions 

As discussed in Sect. 17.8, each type of particle (electron, proton, neutron,
etc.) has a spin $l$, where the possible values for $l$ are

$$l = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \ldots.$$

The Hilbert space for a particle moving in $\mathbb{R}^3$ and having spin $l$ is
$L^2(\mathbb{R}^3) \mathbin{\hat{\otimes}} V_l$, where $V_l$ is a finite-dimensional Hilbert space that carries an irreducible
projective unitary representation of $\mathrm{SO}(3)$ of dimension $2l + 1$. There is a
natural unitary identification of $L^2(\mathbb{R}^3) \mathbin{\hat{\otimes}} V_l$ with $L^2(\mathbb{R}^3; V_l)$, the space of
square-integrable functions on $\mathbb{R}^3$ with values in $V_l$, in which the element
$\psi \otimes v$ of $L^2(\mathbb{R}^3) \mathbin{\hat{\otimes}} V_l$ is identified with the function

$$\mathbf{x} \mapsto \psi(\mathbf{x}) v$$

in $L^2(\mathbb{R}^3; V_l)$.
Now, we have already mentioned, in Sect. 3.11, the idea that in quantum
mechanics, identical particles are indistinguishable. Let us think about this
in the case of two identical particles with spin $l$. Our first guess as to
the Hilbert space for such a system is the tensor product of two copies of
$L^2(\mathbb{R}^3; V_l)$, which may be identified with

$$L^2(\mathbb{R}^6; V_l \otimes V_l).$$

If $\psi$ is a unit vector in this space, thought of as a pure state, then saying that
the two particles are “indistinguishable” means that $\psi(\mathbf{x}^2, \mathbf{x}^1)$ should
represent the same physical state as $\psi(\mathbf{x}^1, \mathbf{x}^2)$, that is, $\psi(\mathbf{x}^2, \mathbf{x}^1) = c\,\psi(\mathbf{x}^1, \mathbf{x}^2)$
for some nonzero constant $c$. Applying this rule twice shows that $c$ must
be either $1$ or $-1$.

A variety of theoretical and experimental considerations suggest the
following principle: For particles with integer spin ($l = 0, 1, \ldots$), the constant
$c$ in the preceding paragraph is $1$, whereas for particles with half-integer
spin ($l = 1/2, 3/2, \ldots$) the constant $c$ is $-1$. Particles with integer spin
are called bosons and particles with half-integer spin are called fermions.
We encode the discussion in the two preceding paragraphs in the following
axiom.

Axiom 11 Consider a collection of $N$ identical particles moving in $\mathbb{R}^3$
and having integer spin $l$. Then the Hilbert space for such a collection is the
subspace of $L^2(\mathbb{R}^{3N}; (V_l)^{\otimes N})$ consisting of those square-integrable functions
$\psi$ for which

$$\psi(\mathbf{x}^{\sigma(1)}, \mathbf{x}^{\sigma(2)}, \ldots, \mathbf{x}^{\sigma(N)}) = \psi(\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N)$$

for every permutation $\sigma$. Consider also a collection of $N$ identical particles
moving in $\mathbb{R}^3$ and having half-integer spin $l$. Then the Hilbert space for
such a collection is the subspace of $L^2(\mathbb{R}^{3N}; (V_l)^{\otimes N})$ consisting of those
square-integrable functions $\psi$ for which

$$\psi(\mathbf{x}^{\sigma(1)}, \mathbf{x}^{\sigma(2)}, \ldots, \mathbf{x}^{\sigma(N)}) = \operatorname{sign}(\sigma)\, \psi(\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N)$$

for every permutation $\sigma$.
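
The two symmetry conditions in Axiom 11 can be produced from an arbitrary function by averaging over permutations. The sketch below is a simplified illustration: it treats scalar-valued functions of $N$ arguments and ignores the spin space $(V_l)^{\otimes N}$ entirely, and the names `symmetrize`, `f`, `g`, and `product` are hypothetical.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation of 0..N-1, computed from its inversion count."""
    inversions = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def symmetrize(psi, n, fermionic=False):
    """Average psi over all permutations of its n arguments.

    With fermionic=True each term is weighted by the sign of the permutation,
    so the result changes sign under any transposition of arguments.
    """
    perms = list(permutations(range(n)))
    def result(*xs):
        total = 0
        for p in perms:
            weight = sign(p) if fermionic else 1
            total += weight * psi(*(xs[i] for i in p))
        return total / len(perms)
    return result

# Assumed one-particle functions and their (unsymmetrized) product.
f = lambda x: x + 1
g = lambda x: x * x
product = lambda x1, x2: f(x1) * g(x2)

boson = symmetrize(product, 2)
fermion = symmetrize(product, 2, fermionic=True)
```

In this toy example `boson(1, 2)` equals `boson(2, 1)`, while `fermion(1, 2)` equals the negative of `fermion(2, 1)`. In particular `fermion(x, x)` vanishes for every `x`, so an antisymmetric function is zero whenever two of its arguments coincide.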

One may well ask why Axiom 11 holds. More specifically, one may first
ask why it is that identical particles are indistinguishable, and then
separately ask why integer-spin particles are bosons and half-integer-spin
particles are fermions. Both questions are best answered from the point of
view of quantum field theory, to which ordinary nonrelativistic quantum
mechanics is an approximation.

In field theory, one starts with a “classical” field theory, meaning a
differential equation for functions $\phi(\mathbf{x}, t)$ on $\mathbb{R}^4$ with values in some
finite-dimensional vector space. Electromagnetic fields, for example, are, at any
one fixed time, functions on $\mathbb{R}^3$ with values in $\mathbb{R}^6$, where $\mathbb{R}^6$ describes
the three components of the electric field and the three components of the
magnetic field. These functions on $\mathbb{R}^3$ then evolve in time according to
Maxwell’s equations. In quantum field theory, one regards, say, Maxwell’s
equations as a sort of infinite-dimensional dynamical system, which we may
quantize in something like the way we quantize Newton’s equation to get
ordinary nonrelativistic quantum mechanics. In the quantum version of
Maxwell’s equations, the energy in each mode of the fields is “quantized,”
meaning that one can only add energy to each mode in multiples of a certain
unit (or “quantum”) of energy. This is analogous to the quantum harmonic
oscillator, in which the allowed energies differ by integer multiples of
$\hbar\omega$. In quantum field theory, then, a particle is one quantum of excitation
of a certain field.

For simplicity, let us think of a field theory in which the classical fields
take values in $\mathbb{R}$. Then even at the classical level, it is possible to think
that we have something like particles, namely localized bumps in the field
$\phi(\mathbf{x})$ located at several different points in space. These bumps might, for
example, be in the shape of a Gaussian wave packet, that is, a Gaussian
envelope multiplied by a sinusoidally oscillating function. From this point of
view, we can gain some understanding of why identical particles are
indistinguishable. Suppose we have a Gaussian wave packet near a point $\mathbf{a}$ in $\mathbb{R}^3$
and then an identically shaped Gaussian wave packet near another point $\mathbf{b}$.
The state $\phi(\mathbf{x})$ of the field is precisely the same as if we have a packet near
$\mathbf{b}$ and then also a packet near $\mathbf{a}$. That is to say, there is no distinct state of
the system that corresponds to interchanging the two particles; whichever
bump we think of as the “first” particle, we have the same field $\phi(\mathbf{x})$. Even
in the quantum version of such a system, there is no meaning to asking which
is the first particle and which is the second. Thus, even in nonrelativistic
quantum mechanics, which is a low-energy approximation to quantum field
theory, we expect identical particles to be indistinguishable.

Although the preceding discussion does not explain the distinction
between bosons and fermions, that distinction also emerges from quantum
field theory, through something called the spin-statistics theorem
(see, e.g., [38]).


19.7 “Statistics” and the Pauli Exclusion Principle 

The spin of an electron is equal to 1/2 and electrons are, therefore, fermions.
The famous Pauli exclusion principle is a consequence of the fermionic
nature of electrons. Pauli’s principle states that two electrons cannot be
in the same state at the same time. This means that if $\psi$ is a
square-integrable, $\mathbb{C}^2$-valued function on $\mathbb{R}^3$ (which could describe the state of a
single electron), then the function $\Psi : \mathbb{R}^6 \to \mathbb{C}^2 \otimes \mathbb{C}^2$ given by

$$\Psi(\mathbf{x}^1, \mathbf{x}^2) = \psi(\mathbf{x}^1) \otimes \psi(\mathbf{x}^2)$$

is not a possible state of a two-electron system, since $\Psi$ does not satisfy
Axiom 11. On the other hand, if $\psi_1$ and $\psi_2$ are two linearly independent
elements of $L^2(\mathbb{R}^3; \mathbb{C}^2)$, then the function $\Phi : \mathbb{R}^6 \to \mathbb{C}^2 \otimes \mathbb{C}^2$ given by

$$\Phi(\mathbf{x}^1, \mathbf{x}^2) = \psi_1(\mathbf{x}^1) \otimes \psi_2(\mathbf{x}^2) - \psi_2(\mathbf{x}^1) \otimes \psi_1(\mathbf{x}^2) \tag{19.10}$$

is a possible state of a two-electron system. [If $\psi_1$ and $\psi_2$ are
independent, then $\Phi$ is a nonzero element of $L^2(\mathbb{R}^6; \mathbb{C}^2 \otimes \mathbb{C}^2)$, which can then be
normalized to be a unit vector. See Exercise 10.]
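The contrast between the product state and the antisymmetrized state can be illustrated in a finite-dimensional analog. The sketch below is only a toy model and makes several assumptions that are not in the text: the single-particle space is replaced by $\mathbb{R}^3$, the vectors `psi1` and `psi2` are arbitrary stand-ins for $\psi_1$ and $\psi_2$, and exchanging the two particles is modeled by transposing the matrix of coefficients.

```python
import numpy as np

def antisymmetrize_pair(u, v):
    """Return u (x) v - v (x) u, stored as a matrix of coefficients."""
    return np.outer(u, v) - np.outer(v, u)

# Hypothetical single-particle vectors (stand-ins for psi_1 and psi_2).
psi1 = np.array([1.0, 0.0, 0.0])
psi2 = np.array([1.0, 1.0, 0.0])   # independent of psi1, but not orthogonal to it

same = antisymmetrize_pair(psi1, psi1)       # both particles in the same state
different = antisymmetrize_pair(psi1, psi2)  # analog of the state in (19.10)

print(np.linalg.norm(same))                  # the antisymmetrized state vanishes
print(np.linalg.norm(different))             # nonzero, so it can be normalized
print(np.allclose(different.T, -different))  # exchange multiplies the state by -1
```

In this toy setting, putting both particles in the same state produces the zero vector, which is the finite-dimensional shadow of the exclusion principle, while independent vectors give a nonzero antisymmetric state.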

Let us try to understand the implications of the Pauli exclusion principle
for multielectron atoms. Let us model an $N$-electron atom as having a
nucleus with positive charge $Nq$, where the charge of a single electron is
$-q$. Since the nucleus is much more massive than the electrons, we can
treat the nucleus as being fixed and the electrons as moving in a potential
of the form $-Nq^2/|\mathbf{x}|$. As a very rough approximation to the structure of
such an atom, we can ignore the electron-electron interaction and take a
Hamiltonian of the form

$$\sum_{j=1}^{N}\left(-\frac{\hbar^2}{2m}\Delta_j - \frac{Nq^2}{|\mathbf{x}^j|}\right),$$

where $\Delta_j$ is the Laplacian acting on the $j$th variable. That is, we are taking
our Hamiltonian to be simply

$$\hat{H} \otimes I \otimes \cdots \otimes I + I \otimes \hat{H} \otimes \cdots \otimes I + \cdots + I \otimes \cdots \otimes I \otimes \hat{H},$$

where $\hat{H}$ is the Hamiltonian for a single electron.

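If, say, $N$ is even, the lowest-energy state for this Hamiltonian in the
antisymmetric subspace of $L^2(\mathbb{R}^{3N}; (\mathbb{C}^2)^{\otimes N})$ will be

$$\Psi_0(\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N) = \mathrm{AS}\left(\psi_0^+(\mathbf{x}^1) \otimes \psi_0^-(\mathbf{x}^2) \otimes \psi_1^+(\mathbf{x}^3) \otimes \cdots \otimes \psi_{N/2-1}^+(\mathbf{x}^{N-1}) \otimes \psi_{N/2-1}^-(\mathbf{x}^N)\right). \tag{19.11}$$

If $N$ is odd, the product ends with $\psi_{(N-1)/2}^+(\mathbf{x}^N)$. The notation in (19.11)
is as follows. First, AS is the antisymmetrization operator, given by

$$\mathrm{AS}(f)(\mathbf{x}^1, \ldots, \mathbf{x}^N) = \sum_{\sigma \in S_N} \operatorname{sign}(\sigma)\, f(\mathbf{x}^{\sigma(1)}, \mathbf{x}^{\sigma(2)}, \ldots, \mathbf{x}^{\sigma(N)}).$$

Second, the functions $\psi_0, \psi_1, \psi_2, \ldots$ are the eigenvectors in $L^2(\mathbb{R}^3)$ for the
Hamiltonian of a single particle in $\mathbb{R}^3$ moving in a potential of the form
$-Nq^2/|\mathbf{x}|$, arranged so that the eigenvalues of $\psi_j$ are weakly increasing
with $j$. The $\psi_j$’s are just the states computed in Chap. 18, but with $q$
replaced by $\sqrt{N}q$. Third, $\psi_j^+(\mathbf{x})$ denotes $\psi_j(\mathbf{x}) \otimes e_1$ and $\psi_j^-(\mathbf{x})$ denotes
$\psi_j(\mathbf{x}) \otimes e_2$, where $\{e_1, e_2\}$ is the standard basis for $\mathbb{C}^2$.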
What the expression for $\Psi_0$ means is that, if we ignore (at first) the
interaction between the electrons, but retain the Pauli exclusion principle, then
we put the first electron into the ground state of the single-electron system,
with “spin up” (i.e., tensored with $e_1$). Then we put the second electron
into the ground state with “spin down” (tensored with $e_2$). Then the third
electron goes into the first excited state of the single-electron system with
spin up, and so on. Of course, this model of a multielectron atom is very
rough, since the interaction between the electrons actually plays a
significant role. Nevertheless, this model highlights the critical role played by
the exclusion principle, which forces successive electrons to go into higher
and higher energy states. In particular, this crude approximation suggests
(correctly!) that even for more realistic models of a multielectron atom, the
lowest energy level in the antisymmetric subspace of $L^2(\mathbb{R}^{3N}; (\mathbb{C}^2)^{\otimes N})$ is
much higher than the lowest energy level of the same Hamiltonian in all of
$L^2(\mathbb{R}^{3N}; (\mathbb{C}^2)^{\otimes N})$.
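The size of this energy gap in the crude noninteracting model can be made concrete with a short computation. The sketch below assumes the standard hydrogen-like spectrum, in which the level with principal quantum number $n$ has energy proportional to $-1/n^2$ and contains $n^2$ orbital states, and it uses units in which the hydrogen ground-state energy is $-1$; replacing $q^2$ by $Nq^2$ then scales every level by $N^2$. The function names are hypothetical and chosen only for this illustration.

```python
def orbital_energies(N, count):
    """First `count` orbital energies (weakly increasing) for nuclear charge Nq.

    Units: hydrogen ground-state energy = -1, so level n has energy -N^2 / n^2
    and is repeated n^2 times (its orbital degeneracy).
    """
    energies = []
    n = 1
    while len(energies) < count:
        energies.extend([-(N ** 2) / n ** 2] * (n * n))
        n += 1
    return energies[:count]

def fermion_ground_energy(N):
    """Fill orbitals two electrons at a time (spin up, then spin down)."""
    orbitals = orbital_energies(N, (N + 1) // 2)
    return sum(orbitals[k // 2] for k in range(N))

def unrestricted_ground_energy(N):
    """Lowest energy with no symmetry constraint: every electron in psi_0."""
    return N * orbital_energies(N, 1)[0]

for N in (2, 3, 10):
    print(N, fermion_ground_energy(N), unrestricted_ground_energy(N))
```

For two electrons the exclusion principle costs nothing in this model, since both fit into the ground orbital with opposite spins; from the third electron onward the antisymmetric ground energy sits strictly above the unrestricted one.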

Meanwhile, in quantum statistical mechanics, one considers a large
collection of identical particles confined to some finite region of space. If the
system is isolated (rather than in thermal equilibrium with its
environment), the goal of statistical mechanics is to “count” the number $N(E)$ of
quantum states with energy less than $E$, as a function of $E$. [That is, $N(E)$
is the number of eigenvalues for the Hamiltonian less than $E$, counted with their
multiplicity.] As the preceding discussion of the Pauli exclusion principle
suggests, we will get very different answers for $N(E)$ if the particles are
fermions than if they are bosons. Bosons are said to follow Bose-Einstein
statistics, whereas fermions are said to follow Fermi-Dirac statistics. The
term “statistics” here refers to the different behavior of the two types of
particles in quantum statistical mechanics. The spin-statistics theorem in
quantum field theory tells us that particles with integer spin have to be
bosons (obeying Bose-Einstein statistics) and particles with half-integer
spin have to be fermions (obeying Fermi-Dirac statistics).
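To see how different the two counts can be, one can enumerate states in a very small toy system. The sketch below is an illustration under simplifying assumptions that are not in the text: the single-particle levels are taken to be nondegenerate with hypothetical energies $0, 1, \ldots, 5$, spin is ignored, and the particles do not interact. It counts states with total energy at most a given bound (a non-strict inequality is used to keep the toy numbers simple). A bosonic state is then a multiset of occupied levels, and a fermionic state is a set of distinct levels.

```python
from itertools import combinations, combinations_with_replacement

def count_states(levels, num_particles, max_energy, fermions):
    """Count noninteracting many-particle states with total energy <= max_energy.

    Fermions occupy distinct levels; bosons may share levels.
    """
    chooser = combinations if fermions else combinations_with_replacement
    return sum(1 for occupied in chooser(levels, num_particles)
               if sum(occupied) <= max_energy)

levels = range(6)  # hypothetical single-particle energies 0, 1, ..., 5
boson_count = count_states(levels, 3, 4, fermions=False)
fermion_count = count_states(levels, 3, 4, fermions=True)
print(boson_count, fermion_count)
```

Even for three particles the bosonic count is several times the fermionic one at the same energy bound, because the fermions are forced out of the lowest levels.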

One fascinating example of quantum statistical mechanics occurs when
the particles are bosons and the interaction between particles is negligible.
In that case, the lowest energy state will simply be

$$\Psi_0(\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N) = \psi_0(\mathbf{x}^1) \otimes \psi_0(\mathbf{x}^2) \otimes \cdots \otimes \psi_0(\mathbf{x}^N),$$

where $\psi_0$ is the ground state of the single-particle system. Now, quantum
statistical mechanics tells us that at a given temperature, the state of the
system will be an (incoherent) superposition of the ground state and the
various excited states. If the temperature is low enough, then the
coefficient of the ground state will be close to 1, and thus, “all the particles are
in the ground state.” A system in such a state is called a Bose-Einstein
condensate, a state that was predicted on theoretical grounds by Satyendra
Nath Bose and Einstein in the 1920s. Bose-Einstein condensates were first
observed experimentally in laser-cooled gases in June 1995 by Eric Cornell
and Carl Wieman, in work for which they, along with Wolfgang Ketterle,
were awarded the 2001 Nobel Prize in physics.


19.8 Exercises 

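1. Suppose that $X$ is a Hilbert-Schmidt operator on $\mathbf{H}$ and that $\{e_j\}$ is
an orthonormal basis for $\mathbf{H}$. Show that

$$\|X\|_{HS}^2 = \sum_{j,k} |\langle e_j, X e_k \rangle|^2.$$

2. Given $\phi, \psi \in \mathbf{H}$, let $|\phi\rangle\langle\psi|$ denote the operator defined in Notation 3.28.
Show that if $A \in \mathcal{B}(\mathbf{H})$ is trace class, then

$$\operatorname{trace}(A\,|\phi\rangle\langle\psi|) = \langle \psi, A\phi \rangle.$$

Hint: If $\{e_j\}$ is an orthonormal basis for $\mathbf{H}$, then for any $\chi \in \mathbf{H}$, we
have $\chi = \sum_j \langle e_j, \chi \rangle e_j$.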

3. Suppose $\Phi : \mathcal{B}(\mathbf{H}) \to \mathbb{C}$ is a linear functional with the properties
(1) that $\Phi(A)$ is real whenever $A$ is self-adjoint and (2) that $\Phi(A)$
is real and non-negative whenever $A$ is self-adjoint and non-negative.
Show that if $A$ is self-adjoint, then

$$-\|A\|\,\Phi(I) \le \Phi(A) \le \|A\|\,\Phi(I).$$

Conclude that $\Phi$ is bounded relative to the operator norm on $\mathcal{B}(\mathbf{H})$.

Hint: Show that if $A$ is self-adjoint, then $\|A\| I + A$ and $\|A\| I - A$ are
non-negative.

4. An operator $A \in \mathcal{B}(\mathbf{H})$ is said to have finite rank if $\operatorname{range}(A)$ is finite
dimensional.

(a) Show that if $A \in \mathcal{B}(\mathbf{H})$ has finite rank, then so does $A^*$.

(b) Given $A \in \mathcal{B}(\mathbf{H})$, show that $A$ has finite rank if and only if there
exist vectors $\phi_1, \ldots, \phi_N$ and $\psi_1, \ldots, \psi_N$ such that

$$A = |\phi_1\rangle\langle\psi_1| + \cdots + |\phi_N\rangle\langle\psi_N|.$$

(c) Let $A$ be any element of $\mathcal{B}(\mathbf{H})$, let $\{e_j\}$ be an orthonormal basis
for $\mathbf{H}$, and let $P_N$ be the orthogonal projection onto the span
of $e_1, \ldots, e_N$. Show that $P_N A$ has finite rank and that for all
$\psi \in \mathbf{H}$, we have

$$\lim_{N \to \infty} \|P_N A\psi - A\psi\| = 0.$$

Note: This result shows that each bounded operator can be
expressed as a strong limit of finite-rank operators. By contrast,
if $\dim \mathbf{H} = \infty$, then Part (a) of Exercise 5 shows that not every
bounded operator can be expressed as an operator-norm limit
of finite-rank operators.

5. In this exercise, assume that $\dim \mathbf{H} = \infty$.

(a) Show that if $A$ has finite rank, then $\|A + cI\| \ge |c|$ for any $c \in \mathbb{C}$.
(With $c = -1$, this shows that $I$ is not an operator-norm limit
of finite-rank operators.)

(b) Let $\mathcal{K}(\mathbf{H})$ denote the closure of the finite-rank operators with
respect to the operator norm on $\mathcal{B}(\mathbf{H})$. Let $V$ denote the space
of operators of the form $B + cI$, with $B \in \mathcal{K}(\mathbf{H})$. Define a linear
functional $\Phi : V \to \mathbb{C}$ by $\Phi(B + cI) = c$ for all $B \in \mathcal{K}(\mathbf{H})$. Show
that $|\Phi(A)| \le \|A\|$ for all $A \in V$.

Note: It can be shown that $\mathcal{K}(\mathbf{H})$ is precisely the space of
compact operators on $\mathbf{H}$.

(c) Let $\Psi_1 : \mathcal{B}(\mathbf{H}) \to \mathbb{C}$ be any linear functional such that $\Psi_1 = \Phi$ on
$V$ and such that $|\Psi_1(A)| \le \|A\|$ for all $A \in \mathcal{B}(\mathbf{H})$. (Such a
functional exists by the Hahn-Banach theorem.) Let $\Psi_2 : \mathcal{B}(\mathbf{H}) \to \mathbb{C}$
be defined by

$$\Psi_2(A) = \frac{1}{2}\left(\Psi_1(A) + \overline{\Psi_1(A^*)}\right).$$

Show that $\Psi_2$ satisfies Properties 1, 2, and 3 of Definition 19.6,
but that there does not exist any density matrix $\rho$ such that
$\Psi_2(A) = \operatorname{trace}(\rho A)$ for all $A \in \mathcal{B}(\mathbf{H})$. (Thus, in light of
Theorem 19.9, $\Psi_2$ must not satisfy Property 4 of Definition 19.6.)

6. In Exercises 6, 7, and 8, assume that each density matrix $\rho$ is
compact, so that $\rho$ has an orthonormal basis $\{e_j\}$ of eigenvectors, for
which the associated eigenvalues $\{\lambda_j\}$ are real and tend to zero as $j$
tends to infinity. (Compare Theorem VI.16 in [34].)

Show that a density matrix $\rho$ is a pure state if and only if $\operatorname{trace}(\rho^2) = 1$.

7. (a) Show that each mixed state $\rho$ is a nontrivial convex combination
of other density matrices.

(b) Show that a pure state cannot be expressed as a nontrivial convex
combination of other density matrices.

Hint: Show that the function $f(\lambda) := \operatorname{trace}\left((\lambda\rho_1 + (1 - \lambda)\rho_2)^2\right)$ is a
convex function of $\lambda$.

8. For any density matrix $\rho$, show that the von Neumann entropy $S(\rho) :=
\operatorname{trace}(-\rho \log \rho)$ is zero if and only if $\rho$ is a pure state.

9. Prove Lemma 19.14.

Hint: First use the principle of uniform boundedness (Theorem A.40)
to show that there exists a constant $C$ with $\|A_n\| \le C$ for all $n$. Then, if
$\{f_j\}$ is an orthonormal basis for $\mathbf{H}_2$, decompose $\mathbf{H}_1 \hat{\otimes} \mathbf{H}_2$ as the Hilbert
space direct sum of the subspaces $\mathbf{H}_1 \otimes f_j$, where each of these subspaces
is isometrically identified with $\mathbf{H}_1$ in the obvious way.

10. Suppose that $\psi_1$ and $\psi_2$ are two linearly independent elements of
$L^2(\mathbb{R}^3; \mathbb{C}^2)$. Show that the function $\Phi$ in (19.10) is a nonzero element
of $L^2(\mathbb{R}^6; \mathbb{C}^2 \otimes \mathbb{C}^2)$.


20 

The Path Integral Formulation 
of Quantum Mechanics 


We turn now to a topic that is important already for ordinary quantum
mechanics and essential in quantum field theory: the so-called path
integral. In the setting of ordinary quantum mechanics (of the sort we have
been considering in this book), the integrals in question are over spaces of
“paths,” that is, maps of some interval $[a, b]$ into $\mathbb{R}^n$. In the setting of
quantum field theory, the integrals are integrals over spaces of “fields,” that is,
maps of some region inside $\mathbb{R}^d$ into $\mathbb{R}^n$. Formal integrals of this sort abound
in the physics literature, and it is typically difficult to make rigorous
mathematical sense of them, although much effort has been expended in the
attempt! In this chapter, we will develop a rigorous integral over spaces of
paths by using the Wiener measure, resulting in the Feynman-Kac formula.

We begin with the Trotter product formula, which will be our main tool 
in deriving the path integral formulas. From there we turn to the (heuristic) 
path integral formula of Feynman, and then to the rigorous version of 
Feynman’s result obtained by M. Kac, the so-called Feynman-Kac formula. 
Although it is not feasible to give complete proofs of all results presented 
here, we give enough proofs to get a flavor of the mathematics involved. 
We will prove a version of the Trotter product formula and, assuming the 
existence of the Wiener measure, a version of the Feynman-Kac formula. 


B.C. Hall, Quantum Theory for Mathematicians, Graduate Texts
in Mathematics 267, DOI 10.1007/978-1-4614-7116-5_20
20.1 Trotter Product Formula


The Lie product formula (Point 7 of Theorem 16.15) says that for all $X$
and $Y$ in $M_n(\mathbb{C})$, we have

$$e^{X+Y} = \lim_{m \to \infty} \left(e^{X/m} e^{Y/m}\right)^m.$$

The Trotter product formula asserts that a similar result holds for certain 
classes of unbounded operators on Hilbert spaces. 
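In finite dimensions the convergence is easy to observe numerically. The following sketch takes $X = itA$ and $Y = itB$ with $A$ and $B$ Hermitian, which is the matrix analog of the unitary case treated below. The two Pauli matrices used for $A$ and $B$, and the helper names, are arbitrary choices made only for this illustration.

```python
import numpy as np

def exp_i_hermitian(H, t=1.0):
    """Return exp(i t H) for a Hermitian matrix H via its spectral decomposition."""
    eigenvalues, U = np.linalg.eigh(H)
    return (U * np.exp(1j * t * eigenvalues)) @ U.conj().T

def trotter_error(A, B, t, N):
    """Operator-norm distance between exp(it(A+B)) and (exp(itA/N) exp(itB/N))^N."""
    exact = exp_i_hermitian(A + B, t)
    step = exp_i_hermitian(A, t / N) @ exp_i_hermitian(B, t / N)
    return np.linalg.norm(exact - np.linalg.matrix_power(step, N), ord=2)

A = np.array([[0.0, 1.0], [1.0, 0.0]])    # Pauli X (does not commute with B)
B = np.array([[1.0, 0.0], [0.0, -1.0]])   # Pauli Z
errors = [trotter_error(A, B, 1.0, N) for N in (1, 10, 100, 1000)]
for N, err in zip((1, 10, 100, 1000), errors):
    print(N, err)
```

For noncommuting matrices the error shrinks roughly in proportion to $1/N$; for unbounded operators no such rate is available in general, which is why the theorem below asserts only strong convergence.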

Theorem 20.1 (Trotter Product Formula) Suppose that $A$ and $B$ are
self-adjoint operators on $\mathbf{H}$ and that $A+B$ is densely defined and essentially
self-adjoint on $\operatorname{Dom}(A) \cap \operatorname{Dom}(B)$. Then the following results hold.

1. For all $\psi \in \mathbf{H}$, we have

$$\lim_{N \to \infty} \left\| e^{it(A+B)}\psi - \left(e^{itA/N} e^{itB/N}\right)^N \psi \right\| = 0. \tag{20.1}$$

2. If $A$ and $B$ are bounded below, then for all $\psi \in \mathbf{H}$, we have

$$\lim_{N \to \infty} \left\| e^{-t(A+B)}\psi - \left(e^{-tA/N} e^{-tB/N}\right)^N \psi \right\| = 0. \tag{20.2}$$

In both results, the expression $A + B$ refers to the unique self-adjoint
extension of the operator defined on $\operatorname{Dom}(A) \cap \operatorname{Dom}(B)$.

In the usual terminology of functional analysis, (20.1) asserts that the
operators $\left(e^{itA/N} e^{itB/N}\right)^N$ converge to $e^{it(A+B)}$ in the “strong operator
topology,” and similarly with (20.2).

We will give a proof of this result in the special case in which $A + B$
is densely defined and self-adjoint on $\operatorname{Dom}(A) \cap \operatorname{Dom}(B)$. This condition
holds, for example, whenever the Kato-Rellich theorem (Theorem 9.37)
applies. See Sect. A.5 of [14] for a proof of the version stated above.
Proof. Since all the operators in Point 1 of the theorem are unitary, it 
is easy to see that if the result holds on some dense subspace W of H, 
it holds on all of H. In Point 2 of the theorem, we first make a simple 
reduction to the case where A and B are non-negative, and then have the 
same conclusion, since all operators involved will then be contractions. 

We will prove Point 1 of the theorem, with the proof of Point 2 being
similar. Let us introduce the notation $S_s := e^{is(A+B)}$ and $T_s := e^{isA}e^{isB}$. What
we want to prove is that for each $\psi \in \mathbf{H}$, the quantity $\|(S_t - (T_{t/N})^N)\psi\|$
tends to zero as $N$ tends to infinity. Now, a simple calculation shows that

$$(S_t - (T_{t/N})^N)\psi = \sum_{j=0}^{N-1} (T_{t/N})^j \,(S_{t/N} - T_{t/N})\,(S_{t/N})^{N-j-1}\psi. \tag{20.3}$$
Since $S_\cdot$ is a one-parameter unitary group, $(S_{t/N})^{N-j-1}\psi = S_s\psi$, where
$s = (N - j - 1)t/N$. Thus, if we let $\psi_s = S_s\psi$ and use that each $T_{t/N}$ is
unitary, we have

$$\|(S_t - (T_{t/N})^N)\psi\| \le N \sup_{0 \le s \le t} \|(S_{t/N} - T_{t/N})\psi_s\|. \tag{20.4}$$

Now, for any $\psi$ in $\operatorname{Dom}(A + B)$, we have

$$\lim_{N \to \infty} N(S_{t/N}\psi - \psi) = it(A + B)\psi,$$

by Stone’s theorem. Meanwhile, according to Exercise 2, we have

$$\lim_{s \to 0} \frac{1}{s}(T_s - I)\psi = iA\psi + iB\psi, \tag{20.5}$$

for all $\psi \in \operatorname{Dom}(A) \cap \operatorname{Dom}(B)$. (This result is clear at the heuristic level.)
Thus,

$$\lim_{N \to \infty} N(S_{t/N} - T_{t/N})\psi = \lim_{N \to \infty} N(S_{t/N} - I)\psi - \lim_{N \to \infty} N(T_{t/N} - I)\psi = it(A + B)\psi - it(A + B)\psi = 0 \tag{20.6}$$

for every $\psi \in \operatorname{Dom}(A) \cap \operatorname{Dom}(B)$.

Let $W = \operatorname{Dom}(A) \cap \operatorname{Dom}(B)$, which is, by assumption, dense in $\mathbf{H}$,
equipped with the norm $\|\cdot\|_1$ given by

$$\|\psi\|_1 := \|\psi\| + \|(A + B)\psi\|.$$

Since we are assuming $A + B$ is self-adjoint, and thus also closed, on $W$,
we see that $W$ is a Banach space with respect to $\|\cdot\|_1$ (Exercise 6 in Chap. 9).
Now, the operators $N(S_{t/N} - T_{t/N})$ are certainly bounded from $W$ to $\mathbf{H}$,
for each $N$. Furthermore, (20.6) shows that for each $\psi \in W$, we have

$$\sup_N \|N(S_{t/N} - T_{t/N})\psi\| < \infty.$$

Thus, by the principle of uniform boundedness (Theorem A.40), there is a
constant $C$ such that

$$\|N(S_{t/N} - T_{t/N})\psi\| \le C\|\psi\|_1$$

for all $\psi \in W$. It then follows (Exercise 3) that $\|N(S_{t/N} - T_{t/N})\psi\|$ tends
to zero uniformly on every compact subset of $W$.

Suppose, now, that for each $\psi \in W$, the map $s \mapsto \psi_s$ is continuous in $W$. If
so, the image of the compact interval $[0, t]$ under $s \mapsto \psi_s$ will be compact
in $W$, and so $\|N(S_{t/N} - T_{t/N})\psi_s\|$ will tend to zero uniformly in $s$. Thus,
by (20.4), we will have Point 1 of the theorem. To establish the desired
continuity, we first note that by Lemma 10.17, the operators $S_s = e^{is(A+B)}$
preserve $\operatorname{Dom}(A + B)$, which is equal to $W$ by assumption. Then for any
$s, r \in [0, t]$ and $\psi \in W$, we have

$$\begin{aligned}
\left\|e^{is(A+B)}\psi - e^{ir(A+B)}\psi\right\|_1
&= \left\|e^{is(A+B)}\psi - e^{ir(A+B)}\psi\right\| + \left\|(A + B)\left(e^{is(A+B)}\psi - e^{ir(A+B)}\psi\right)\right\| \\
&= \left\|e^{is(A+B)}\psi - e^{ir(A+B)}\psi\right\| + \left\|\left(e^{is(A+B)} - e^{ir(A+B)}\right)(A + B)\psi\right\|,
\end{aligned} \tag{20.7}$$

where we have used Lemma 10.17 again in the second equality. The strong
continuity of $e^{is(A+B)}$ (Proposition 10.14) then ensures that the right-hand
side of (20.7) tends to zero as $s$ approaches $r$. ∎
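The algebraic core of the proof, the telescoping identity (20.3) and the resulting bound (20.4), holds for any pair of unitary operators and can be checked directly on matrices. In the sketch below, `S` and `T` are random unitary matrices standing in for $S_{t/N}$ and $T_{t/N}$; the dimension, the seed, and the helper names are arbitrary choices made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unitary(n):
    """Unitary factor from the QR decomposition of a random complex matrix."""
    Z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    Q, _ = np.linalg.qr(Z)
    return Q

def telescoping_sum(S, T, N):
    """Sum over j = 0..N-1 of T^j (S - T) S^(N-j-1), as in (20.3)."""
    total = np.zeros_like(S)
    for j in range(N):
        total += (np.linalg.matrix_power(T, j) @ (S - T)
                  @ np.linalg.matrix_power(S, N - j - 1))
    return total

S, T, N = random_unitary(4), random_unitary(4), 7
lhs = np.linalg.matrix_power(S, N) - np.linalg.matrix_power(T, N)
rhs = telescoping_sum(S, T, N)

print(np.allclose(lhs, rhs))
# Unitarity of S and T gives the operator-norm analog of (20.4).
print(np.linalg.norm(lhs, 2) <= N * np.linalg.norm(S - T, 2) + 1e-9)
```

The bound in (20.4) is sharper than this matrix version because it is applied vector by vector, along the curve $\psi_s$, rather than in operator norm.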


20.2 Formal Derivation of the Feynman Path 
Integral 

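In this section, we apply Point 1 of the Trotter product formula to the
operator

$$\hat{H} = -\frac{\hbar^2}{2m}\Delta + V(X). \tag{20.8}$$

Let us call the operators on the right-hand side of (20.8) $A$ and $B$,
respectively, and let us assume $V$ is sufficiently nice that $\hat{H}$ is essentially
self-adjoint on $\operatorname{Dom}(A) \cap \operatorname{Dom}(B)$. Any bounded potential certainly has
this property, as do many unbounded potentials. (See, e.g., Theorem 9.38.)
Point 1 of Theorem 20.1 then tells us that

$$e^{-it\hat{H}/\hbar}\psi = \lim_{N \to \infty} \left[\exp\left(\frac{it\hbar\Delta}{2mN}\right) \exp\left(-\frac{itV(X)}{N\hbar}\right)\right]^N \psi.$$

Under mild assumptions on $\psi$, Theorem 4.5 (extended to $n$ dimensions)
tells us that $\exp(it\hbar\Delta/(2mN))$ may be computed as

$$\left(\exp\left(\frac{it\hbar\Delta}{2mN}\right)\psi\right)(\mathbf{x}_0) = \left(\frac{mN}{2\pi it\hbar}\right)^{n/2} \int_{\mathbb{R}^n} \exp\left\{i\frac{mN}{2t\hbar}|\mathbf{x}_1 - \mathbf{x}_0|^2\right\} \psi(\mathbf{x}_1)\, d\mathbf{x}_1.$$

Meanwhile, $\exp(-itV(X)/(N\hbar))$ is simply a multiplication operator.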
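Thus, assuming that Theorem 4.5 applies at each stage, we have

$$\begin{aligned}
&\left(\left[\exp\left(\frac{it\hbar\Delta}{2mN}\right) \exp\left(-\frac{itV(X)}{N\hbar}\right)\right]^N \psi\right)(\mathbf{x}_0) \\
&\quad = C \int_{\mathbb{R}^n} \exp\left\{i\frac{mN}{2t\hbar}|\mathbf{x}_1 - \mathbf{x}_0|^2\right\} \exp\left\{-\frac{itV(\mathbf{x}_1)}{N\hbar}\right\} \cdots \\
&\qquad \times \int_{\mathbb{R}^n} \exp\left\{i\frac{mN}{2t\hbar}|\mathbf{x}_{N-1} - \mathbf{x}_{N-2}|^2\right\} \exp\left\{-\frac{itV(\mathbf{x}_{N-1})}{N\hbar}\right\} \\
&\qquad \times \int_{\mathbb{R}^n} \exp\left\{i\frac{mN}{2t\hbar}|\mathbf{x}_N - \mathbf{x}_{N-1}|^2\right\} \exp\left\{-\frac{itV(\mathbf{x}_N)}{N\hbar}\right\} \\
&\qquad \times \psi(\mathbf{x}_N)\, d\mathbf{x}_N\, d\mathbf{x}_{N-1} \cdots d\mathbf{x}_1,
\end{aligned}$$

where $C = (mN/(2\pi it\hbar))^{nN/2}$. Letting $\varepsilon = t/N$ and assuming we can freely
rearrange the order of integration, we obtain

$$(e^{-it\hat{H}/\hbar}\psi)(\mathbf{x}_0) = \lim_{N \to \infty} C \int_{(\mathbb{R}^n)^N} \exp\left\{\frac{i}{\hbar} \sum_{j=1}^{N} \varepsilon \left[\frac{m}{2}\left|\frac{\mathbf{x}_j - \mathbf{x}_{j-1}}{\varepsilon}\right|^2 - V(\mathbf{x}_j)\right]\right\} \psi(\mathbf{x}_N)\, d\mathbf{x}_1\, d\mathbf{x}_2 \cdots d\mathbf{x}_N. \tag{20.9}$$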


So far, the argument is mostly rigorous, coming from the Trotter product
formula and Theorem 4.5. The nonrigorous part comes in attempting to
evaluate the limit on the right-hand side of (20.9). Let us think of the
values $\mathbf{x}_j$, $j = 0, \ldots, N$, as constituting the values of a path $\mathbf{x}(s)$ at the
points $s_j := j\varepsilon = jt/N$:

$$\mathbf{x}_j = \mathbf{x}(jt/N).$$

Since the distance between $s_{j-1}$ and $s_j$ is $\varepsilon$, the quantity $|\mathbf{x}_j - \mathbf{x}_{j-1}|/\varepsilon$ is
an approximation to the magnitude of the derivative of $\mathbf{x}(s)$ with respect to $s$.
Meanwhile, the sum over $j$ in the right-hand side of (20.9) is an approximation
to an integral. Thus, if we then take the limit of the right-hand side of (20.9)
in a totally nonrigorous fashion, we obtain

$$(e^{-it\hat{H}/\hbar}\psi)(\mathbf{x}_0) = C \int_{\substack{\text{paths with} \\ \mathbf{x}(0) = \mathbf{x}_0}} \exp\left\{\frac{i}{\hbar} \int_0^t \left[\frac{m}{2}\left|\frac{d\mathbf{x}}{ds}\right|^2 - V(\mathbf{x}(s))\right] ds\right\} \psi(\mathbf{x}(t))\, \mathcal{D}\mathbf{x}. \tag{20.10}$$

Here, $C$ is a normalization constant and $\mathcal{D}\mathbf{x}$ is something like “Lebesgue
measure” on the space of all paths $\mathbf{x}(\cdot)$ mapping $[0, t]$ into $\mathbb{R}^n$. (The quantity
$\mathbf{x}$ in the expression $\mathcal{D}\mathbf{x}$ is a path, not a point in $\mathbb{R}^n$.)
The reader who is familiar with the Lagrangian approach to mechanics
will recognize the expression in square brackets in the exponent on the
right-hand side of (20.10) as the Lagrangian of the particle, $L = T - V$,
where $T = (1/2)m|\mathbf{v}|^2$ is the kinetic energy and $V$ is the potential energy.
The integral of the Lagrangian over some time interval is called the action
functional, denoted by the letter $S$. That is to say, given a path $\mathbf{x}(\cdot)$, we
define the action functional of $\mathbf{x}(\cdot)$ over a time-interval $[a, b]$ as follows:

$$S(\mathbf{x}(\cdot), a, b) = \int_a^b \left[\frac{m}{2}\left|\frac{d\mathbf{x}}{ds}\right|^2 - V(\mathbf{x}(s))\right] ds. \tag{20.11}$$

In Lagrangian mechanics, one shows that the solutions to Newton’s law are
precisely the stationary points of the action functional. Using the notation
in (20.11), we may rewrite (20.10) as

$$(e^{-it\hat{H}/\hbar}\psi)(\mathbf{x}_0) = C \int_{\substack{\text{paths with} \\ \mathbf{x}(0) = \mathbf{x}_0}} \exp\left\{\frac{i}{\hbar} S(\mathbf{x}(\cdot), 0, t)\right\} \psi(\mathbf{x}(t))\, \mathcal{D}\mathbf{x}. \tag{20.12}$$

This formula is the Feynman path integral formula.
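The discretized action appearing in the exponent of (20.9), and the fact that classical trajectories are stationary points of the action, can both be explored numerically. The sketch below is a one-dimensional illustration under assumptions that are not in the text: the potential is taken to be $V(x) = x^2/2$ with $m = 1$, the classical path is the solution of $x'' = -x$ with $x(0) = 0$ and $x(t) = 1$, and the perturbation `bump` is an arbitrary function vanishing at both endpoints.

```python
import numpy as np

def discrete_action(path, eps, V, m=1.0):
    """Sum over j = 1..N of eps * [ (m/2) ((x_j - x_{j-1}) / eps)^2 - V(x_j) ]."""
    velocity = np.diff(path) / eps
    return np.sum(eps * (0.5 * m * velocity**2 - V(path[1:])))

V = lambda x: 0.5 * x**2            # harmonic potential, chosen only for illustration
t, N = 1.0, 1000
s = np.linspace(0.0, t, N + 1)      # the grid points s_j = j t / N
eps = t / N

classical = np.sin(s) / np.sin(t)   # solves x'' = -x with x(0) = 0, x(t) = 1
bump = np.sin(np.pi * s / t)        # perturbation vanishing at both endpoints

def action_along(delta):
    """Discrete action of the classical path perturbed by delta * bump."""
    return discrete_action(classical + delta * bump, eps, V)

# Central difference in delta: the first variation of the action at the classical path.
first_variation = (action_along(1e-3) - action_along(-1e-3)) / 2e-3
print(first_variation)                        # close to zero
print(action_along(0.1) - action_along(0.0))  # change is second order in delta
```

The first variation vanishes up to discretization error, while the action itself changes only at second order in the size of the perturbation.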

Now, knowledge of Lagrangian mechanics is not directly relevant to the
derivation of the Feynman path integral formula. Nevertheless, it is
intriguing that an important quantity from classical mechanics should appear
in the Feynman path integral formula in quantum mechanics. This
appearance raises the possibility that one can use the path integral formula
to make connections between quantum mechanics and classical mechanics.
Indeed, the “method of stationary phase” (when applied, formally, in an
infinite-dimensional setting) asserts that for small values of $\hbar$, the main
contribution to the right-hand side of (20.12) comes from regions near the
stationary points of the action functional, namely the classical trajectories.
Using this method, Gutzwiller was able to derive his famous trace formula,
which provides predictions of typical eigenvalue spacings for Schrödinger
operators based on the behavior of the underlying classical system. More
information about this fascinating subject can be found in books on
“quantum chaos,” including [19] by Gutzwiller himself.

It is notoriously difficult to attach a rigorous meaning to the right-hand
side of the Feynman path integral formula. Note that the formal expression
“$\mathcal{D}\mathbf{x}$” is the limit as $N$ tends to infinity of the integral over $(\mathbb{R}^n)^N$ in
(20.9) with respect to the Lebesgue measure (i.e., the measure given by
$d\mathbf{x}_1\, d\mathbf{x}_2 \cdots d\mathbf{x}_N$). Thus, “$\mathcal{D}\mathbf{x}$” should be something like Lebesgue measure
on the space of all paths (maps from $[0, t]$ into $\mathbb{R}^n$). However, it is known
that an infinite-dimensional vector space (say, a Banach space) does not
have any “reasonable” (say, $\sigma$-finite) translation-invariant measure that
could play the role of Lebesgue measure. Furthermore, the absolute value
of the constant $C$ is easily seen to be infinite. Thus, we certainly cannot
take the right-hand side of (20.12) literally.







A better approach is to avoid looking at the component parts of the
Feynman path integral and instead to look at the whole expression against
which the function $\psi(\mathbf{x}(t))$ is being integrated. If we could attach a rigorous
meaning to the expression

$$C \exp\left\{\frac{i}{\hbar} S(\mathbf{x}(\cdot), 0, t)\right\} \mathcal{D}\mathbf{x}, \tag{20.13}$$

as, say, a complex-valued measure on the space of continuous paths, then
this could serve to give a meaning to the path integral. It is known, however,
that there is no complex measure on the space of paths that makes the
Feynman path integral formula true. The oscillatory behavior produced by
the $i$ in the exponent in (20.13) makes it difficult to give a rigorous meaning
to the Feynman path integral in its original form.

20.3 The Imaginary-Time Calculation 


In trying to give a rigorous meaning to the path integral formula of
Feynman, Kac proceeded by considering the “imaginary time” time-evolution
operator $\exp(-t\hat{H}/\hbar)$, which is just the usual time-evolution operator
$\exp(-it\hat{H}/\hbar)$ evaluated with $t$ replaced by $-it$. The idea is that if one
can use path integrals to understand the operators $\exp(-t\hat{H}/\hbar)$, one can
go back to the “real time” operator $\exp(-it\hat{H}/\hbar)$ by analytic continuation
with respect to $t$.

The counterpart of Theorem 4.5 for $\exp(t\hbar\Delta/(2m))$ (proved in the
same way) is

$$(e^{t\hbar\Delta/(2m)}\psi)(\mathbf{x}_0) = \left(\frac{m}{2\pi t\hbar}\right)^{n/2} \int_{\mathbb{R}^n} \exp\left\{-\frac{m}{2t\hbar}|\mathbf{x}_1 - \mathbf{x}_0|^2\right\} \psi(\mathbf{x}_1)\, d\mathbf{x}_1.$$

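Unlike Theorem 4.5, however, the above expression holds for all $\psi \in L^2(\mathbb{R}^n)$,
with absolute convergence of the integral for every $\mathbf{x}_0 \in \mathbb{R}^n$. Applying the
Trotter product formula and rearranging the integral as before gives

$$(e^{-t\hat{H}/\hbar}\psi)(\mathbf{x}_0) = \lim_{N \to \infty} C \int_{(\mathbb{R}^n)^N} \exp\left\{-\frac{1}{\hbar} \sum_{j=1}^{N} \varepsilon \left[\frac{m}{2}\left|\frac{\mathbf{x}_j - \mathbf{x}_{j-1}}{\varepsilon}\right|^2 + V(\mathbf{x}_j)\right]\right\} \psi(\mathbf{x}_N)\, d\mathbf{x}_1\, d\mathbf{x}_2 \cdots d\mathbf{x}_N. \tag{20.14}$$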

If $V$ is, say, bounded below, then there is no difficulty in changing the
order of integration, because of the rapid decay of the integrand. Note that
there is a relative sign change between the two terms in square brackets,
compared to (20.9). Taking a formal limit as before gives

$$(e^{-t\hat{H}/\hbar}\psi)(\mathbf{x}_0) = C \int_{\substack{\text{paths with} \\ \mathbf{x}(0) = \mathbf{x}_0}} \exp\left\{-\frac{1}{\hbar} \int_0^t \left[\frac{m}{2}\left|\frac{d\mathbf{x}}{ds}\right|^2 + V(\mathbf{x}(s))\right] ds\right\} \psi(\mathbf{x}(t))\, \mathcal{D}\mathbf{x}. \tag{20.15}$$

Note that the integral in the exponent on the right-hand side is not the
classical action in (20.11), because the potential term has the wrong sign.

Kac’s idea was to separate out the quadratic part of the exponent on the
right-hand side of (20.15) and attempt to interpret the expression

$$C \exp\left\{-\frac{1}{\hbar} \int_0^t \frac{m}{2}\left|\frac{d\mathbf{x}}{ds}\right|^2 ds\right\} \mathcal{D}\mathbf{x} \tag{20.16}$$

as a measure on the space of paths. Specifically, this is a Gaussian measure,
one with a (formal) density with respect to the Lebesgue measure that is
the exponential of a quadratic expression. There is a well-developed
theory of Gaussian measures on infinite-dimensional spaces. Although there
is no Lebesgue measure in the infinite-dimensional case, one can construct
Gaussian measures as limits of Gaussian measures on spaces of large finite
dimension.


20.4 The Wiener Measure 

Kac identified the formal expression in (20.16) as the Wiener measure. To
be precise, for each fixed $\mathbf{x}_0 \in \mathbb{R}^n$, there is a Wiener measure $\mu_{\mathbf{x}_0}$, where $\mu_{\mathbf{x}_0}$
is supported on the set of paths $\mathbf{x} : [0, t] \to \mathbb{R}^n$ with $\mathbf{x}(0) = \mathbf{x}_0$. The Wiener
measure was developed by Norbert Wiener as a rigorous embodiment of
Albert Einstein’s mathematical model of Brownian motion. Einstein, in one
of his 1905 papers, had proposed that the random motion of a very small
particle in water was due to collisions between the particle and the water
molecules. Einstein postulated that the increments of a Brownian path
$\mathbf{x}$ [quantities of the form $\mathbf{x}(t) - \mathbf{x}(s)$] should be independent for disjoint
time intervals and should be normal random variables with mean zero and
variance proportional to $t - s$. The following theorem shows that there
is a unique measure on the space of continuous paths satisfying Einstein’s
criteria. Let $C_{\mathbf{x}_0}([0, t]; \mathbb{R}^n)$ denote the space of continuous maps $\mathbf{x}(\cdot)$ of $[0, t]$
into $\mathbb{R}^n$ satisfying $\mathbf{x}(0) = \mathbf{x}_0$, equipped with the supremum norm.

Theorem 20.2 (Wiener) For each vector $\mathbf{x}_0 \in \mathbb{R}^n$ and each pair of
positive numbers $\sigma$ and $t$, there exists a unique measure $\mu^\sigma_{\mathbf{x}_0}$ on the Borel
$\sigma$-algebra in $C_{\mathbf{x}_0}([0, t]; \mathbb{R}^n)$ such that the following condition holds. For each
sequence $0 = t_0 < t_1 < \cdots < t_N \le t$ of real numbers and each non-negative
measurable function $f$ on $(\mathbb{R}^n)^N$, we have

$$\int_{C_{\mathbf{x}_0}([0,t];\mathbb{R}^n)} f(\mathbf{x}(t_1), \ldots, \mathbf{x}(t_N))\, d\mu^\sigma_{\mathbf{x}_0}(\mathbf{x}) = C \int_{(\mathbb{R}^n)^N} f(\mathbf{x}_1, \ldots, \mathbf{x}_N) \exp\left\{-\sum_{j=1}^{N} \frac{|\mathbf{x}_j - \mathbf{x}_{j-1}|^2}{2\sigma(t_j - t_{j-1})}\right\} d\mathbf{x}_1\, d\mathbf{x}_2 \cdots d\mathbf{x}_N, \tag{20.17}$$

where

$$C = \prod_{j=1}^{N} \left(2\pi\sigma(t_j - t_{j-1})\right)^{-n/2}.$$
Note that the right-hand side of (20.17) is extremely similar to the
right-hand side of (20.14), except that there are no terms involving the potential
$V$ in the exponent in (20.17). Thus, it is reasonable to think that the Wiener
measure is a rigorous version of the formal expression in (20.16). It should
be noted, however, that the heuristic expression (20.16) is misleading in one
important respect. That expression suggests that the measure is supported
on paths $\mathbf{x}(\cdot)$ for which $d\mathbf{x}/dt$ belongs to $L^2([0, t]; \mathbb{R}^n)$, since the exponential
factor would seemingly “damp out” any paths for which this is not the case.
This conclusion is, however, incorrect. [One should, in general, be extremely
cautious in drawing conclusions based on purely formal expressions such as
the one in (20.16).] Actually, the “typical” path with respect to the Wiener
measure is nowhere differentiable; that is, the set of paths $\mathbf{x}(t)$ that are
differentiable for even one value of $t$ forms a set of measure zero.
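Einstein's criteria translate directly into a recipe for sampling discretized Brownian paths, and a small simulation gives a feel for why typical paths are so rough. The sketch below is a one-dimensional illustration; it assumes that "variance proportional to $t - s$" means variance $\sigma(t - s)$ with $\sigma$ playing the role of the first parameter in Theorem 20.2, and the parameter values, seed, and helper name are arbitrary choices for this example.

```python
import numpy as np

def sample_paths(x0, sigma, t, num_steps, num_paths, rng):
    """Sample paths on a uniform grid with independent N(0, sigma * dt) increments."""
    dt = t / num_steps
    increments = rng.normal(0.0, np.sqrt(sigma * dt), size=(num_paths, num_steps))
    paths = x0 + np.cumsum(increments, axis=1)
    return np.hstack([np.full((num_paths, 1), float(x0)), paths])

rng = np.random.default_rng(1)
paths = sample_paths(x0=0.0, sigma=2.0, t=1.0, num_steps=64, num_paths=20000, rng=rng)

# Variance of x(s) - x(0) should be close to sigma * s.
var_half = np.var(paths[:, 32])
var_full = np.var(paths[:, 64])
# Increments over disjoint time intervals should be uncorrelated.
corr = np.corrcoef(paths[:, 32] - paths[:, 0], paths[:, 64] - paths[:, 32])[0, 1]

# Difference quotients grow as the time step shrinks (like 1 / sqrt(step)).
dt = 1.0 / 64
fine = np.mean(np.abs(paths[:, 1:] - paths[:, :-1])) / dt
coarse = np.mean(np.abs(paths[:, 4:] - paths[:, :-4])) / (4 * dt)
print(var_half, var_full, corr, fine / coarse)
```

Shrinking the time step by a factor of 4 roughly doubles the typical difference quotient, so the quotients do not settle down to a derivative; this is consistent with, though of course no proof of, the nowhere-differentiability statement.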

This discrepancy is actually a general feature of Gaussian measures on
infinite-dimensional spaces: They are always supported on a larger space
than the formal expression would suggest. In the case of the Wiener
measure, the space on which the measure actually lives (the space of continuous
functions) is nice enough that no difficulties arise in the formulation of our
main result, the Feynman-Kac formula. In the setting of quantum field
theory, however, issues concerning the support of a Gaussian measure become
serious difficulties. See Sect. 20.6 for more information.


20.5 The Feynman-Kac Formula 

The Wiener measure gives a rigorous interpretation to the expression in
(20.16). Thus, the Wiener measure encapsulates everything in (20.15)
except for the term involving $V$ in the exponent and the factor of $\psi(\mathbf{x}(t))$.
This reasoning accounts for the form of the following result.

Theorem 20.3 (Feynman-Kac Formula) Suppose $V : \mathbb{R}^3 \to \mathbb{R}$ can be
expressed as the sum of a function in $L^2(\mathbb{R}^3)$ and a bounded function. Then
for all $\mathbf{x}_0 \in \mathbb{R}^3$, we have

$$(e^{-t\hat{H}/\hbar}\psi)(\mathbf{x}_0) = \int_{C_{\mathbf{x}_0}([0,t];\mathbb{R}^3)} \exp\left\{-\frac{1}{\hbar} \int_0^t V(\mathbf{x}(s))\, ds\right\} \psi(\mathbf{x}(t))\, d\mu^\sigma_{\mathbf{x}_0}(\mathbf{x}),$$

where $\mu^\sigma_{\mathbf{x}_0}$ is the Wiener measure on $C_{\mathbf{x}_0}([0, t]; \mathbb{R}^3)$ and where $\sigma = \hbar/m$.

Of course, similar results hold in other dimensions, under suitable
assumptions on the potential. We refer the interested reader to [37] or [14]
for details on different versions of the Feynman-Kac formula. Theorem 20.3
cannot be obtained directly from the Trotter product formula, because the
limit in (20.14) is an $L^2$ limit rather than a pointwise limit. We will
content ourselves with proving an “integrated” version of the Feynman-Kac
formula for nice potentials; Theorem 20.3 is Theorem 6.5 of [37].

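Definition 20.4 Let $C([0, t]; \mathbb{R}^n)$ denote the space of all continuous paths
on $[0, t]$ with values in $\mathbb{R}^n$. For all $\sigma > 0$, let $\mu^\sigma$ be the measure on
$C([0, t]; \mathbb{R}^n)$ given by

$$\mu^\sigma(E) = \int_{\mathbb{R}^n} \mu^\sigma_{\mathbf{x}_0}(E)\, d\mathbf{x}_0.$$

Proposition 20.5 Suppose $V : \mathbb{R}^n \to \mathbb{R}$ is bounded and continuous. Then
for all $\phi, \psi \in L^2(\mathbb{R}^n)$, we have

$$\langle \phi, e^{-t\hat{H}/\hbar}\psi \rangle = \int_{C([0,t];\mathbb{R}^n)} \overline{\phi(\mathbf{x}(0))} \exp\left(-\frac{1}{\hbar} \int_0^t V(\mathbf{x}(s))\, ds\right) \psi(\mathbf{x}(t))\, d\mu^\sigma(\mathbf{x}),$$

where $\mu^\sigma$ is as in Definition 20.4 and where $\sigma = \hbar/m$.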

Proof. We begin with (20.14) and apply Theorem 20.2 with parameters chosen as follows. We take $\sigma=\hbar/m$, we take the sequence $(t_j)$ to be given by $t_j=jt/N$, and we take $f$ to be the function given by

$$ f(x_1,x_2,\ldots,x_N) = \exp\left\{-\frac{1}{\hbar}\sum_{j=1}^N V(x_j)\,\frac{t}{N}\right\}\psi(x_N). $$

Theorem 20.2 then allows us to express the right-hand side of (20.14) as an integral against the Wiener measure, giving

$$ (e^{-t\hat H/\hbar}\psi)(x_0) = \lim_{N\to\infty}\int_{C_{x_0}([0,t];\mathbb{R}^n)} \exp\left\{-\frac{1}{\hbar}\sum_{j=1}^N V(x(t_j))\,\frac{t}{N}\right\}\psi(x(t))\,d\mu_\sigma^{x_0}(x). $$

Since the limit in the above equation is an $L^2$ limit, we may move the inner product with $\phi$ inside the limit on the right-hand side. The integral with respect to $\mu_\sigma^{x_0}$ and the integral with respect to $dx_0$ may then be combined into a single integral with respect to $\mu_\sigma$, giving

$$ \langle\phi, e^{-t\hat H/\hbar}\psi\rangle = \lim_{N\to\infty}\int_{C([0,t];\mathbb{R}^n)} \overline{\phi(x(0))}\,\exp\left\{-\frac{1}{\hbar}\sum_{j=1}^N V(x(t_j))\,\frac{t}{N}\right\}\psi(x(t))\,d\mu_\sigma(x). \tag{20.18} $$

Now, since $V$ is continuous,

$$ \lim_{N\to\infty}\sum_{j=1}^N V(x(t_j))\,\frac{t}{N} = \int_0^t V(x(s))\,ds $$

for every continuous path $x$. Furthermore, it is easily seen that the “distribution” of the quantity $x(s)$ with respect to the measure $\mu_\sigma$ is the Lebesgue measure on $\mathbb{R}^n$, for any $s\in[0,t]$. Thus, the function $x\mapsto\phi(x(0))$ is square-integrable with respect to $\mu_\sigma$, with $L^2$ norm equal to the $L^2$ norm of $\phi$ over $\mathbb{R}^n$, and similarly for $x\mapsto\psi(x(t))$. It follows that the quantity $\overline{\phi(x(0))}\,\psi(x(t))$ is an $L^1$ function on $C([0,t];\mathbb{R}^n)$. Since $V$ is bounded, the exponential factors in (20.18) are bounded uniformly in $N$, so we may apply dominated convergence to move the limit inside the integral, at which point we obtain the desired result. ■
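The Riemann-sum form of the Feynman-Kac integrand used in the proof lends itself to a simple Monte Carlo experiment. The following is a minimal sketch, not a construction from the text: it works in one dimension in units where $\hbar=m=1$ (so $\sigma=1$), and the function name `feynman_kac_mc` and its parameters are hypothetical. As a check it uses the harmonic potential $V(x)=x^2/2$ with $\psi\equiv 1$ and $x_0=0$, for which the classical Cameron-Martin formula gives the value $(\cosh t)^{-1/2}$. Note that this potential is unbounded and $\psi\equiv 1$ is not square-integrable, so the example lies outside the hypotheses of Proposition 20.5 and is meant only as an illustration.

```python
import math
import random


def feynman_kac_mc(x0, t, V, psi, n_paths=4000, n_steps=200, sigma=1.0, seed=0):
    """Monte Carlo estimate of the Feynman-Kac integral (units with hbar = 1).

    Samples discretized Brownian paths x(t_j), t_j = j*t/N, started at x0,
    with increments of variance sigma * (t/N), and averages
        exp(-sum_j V(x(t_j)) * t/N) * psi(x(t)).
    """
    rng = random.Random(seed)          # fixed seed for reproducibility
    dt = t / n_steps
    step_sd = math.sqrt(sigma * dt)
    total = 0.0
    for _ in range(n_paths):
        x = x0
        action = 0.0
        for _ in range(n_steps):
            x += rng.gauss(0.0, step_sd)
            action += V(x) * dt        # Riemann sum approximating int_0^t V(x(s)) ds
        total += math.exp(-action) * psi(x)
    return total / n_paths


if __name__ == "__main__":
    estimate = feynman_kac_mc(0.0, 1.0, lambda x: 0.5 * x * x, lambda x: 1.0)
    exact = math.cosh(1.0) ** -0.5     # Cameron-Martin closed form
    print(estimate, exact)
```

The estimate carries both a statistical error (decreasing like the inverse square root of the number of paths) and a discretization error from replacing the time integral by a Riemann sum, so agreement with the closed form should only be expected to a couple of decimal places.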


20.6 Path Integrals in Quantum Field Theory 


In this section, we briefly discuss the path integral approach to quantum field theory. We consider quantum field theory in a space-time of dimension $d$, so that space has dimension $d-1$. The configuration space for the classical version of the theory is the collection of “spatial” fields, that is, maps $\phi(x)$ of $\mathbb{R}^{d-1}$ into some finite-dimensional vector space $V$. A path in the space of fields is then a map $\phi(x,t)$ of $\mathbb{R}^{d-1}\times\mathbb{R}=\mathbb{R}^d$ into $V$. In the path integral approach to quantum field theory (which is the most commonly used approach to the subject), one considers integrals over the space of such paths.

Let us consider, as a simple example, what is called $\phi^4$ theory. In this theory, the fields $\phi$ map into $\mathbb{R}$ and we consider a path integral of the form

$$ \int_{\mathcal{F}_d}\exp\left\{-\int_{\mathbb{R}^d}\left[c_1\|\nabla\phi(x)\|^2 + c_2\,\phi(x)^2 + c_3\,\phi(x)^4\right]dx\right\}F(\phi)\,\mathcal{D}\phi, \tag{20.19} $$

for some functional $F(\phi)$ on the space of fields. [The expression in (20.19) is, more precisely, a “Euclidean” or “imaginary time” path integral. Such an integral is the counterpart in quantum field theory of the integral occurring in the Feynman-Kac formula in quantum mechanics.] In (20.19), $\mathcal{F}_d$ represents the space of all “fields” (i.e., functions) mapping our space-time into $\mathbb{R}$. In an attempt to make sense of this heuristic expression, we may follow the strategy we used in deriving the Feynman-Kac formula by separating out the quadratic part of the exponent. We look, then, for a measure $\mu$ on $\mathcal{F}_d$ given by the heuristic expression

$$ d\mu(\phi)\ \text{“=”}\ C\exp\left\{-\int_{\mathbb{R}^d}\left[c_1\|\nabla\phi(x)\|^2 + c_2\,\phi(x)^2\right]dx\right\}\mathcal{D}\phi. \tag{20.20} $$


Using the theory of Gaussian measures, one can construct a rigorously defined measure corresponding to the heuristic expression in (20.20). There is, however, a serious difficulty with this approach: The measure $\mu$ is supported on very “rough” fields, much rougher than the heuristic expression suggests. In fact, we have the following result.

Proposition 20.6 For all $d\ge 1$, there exists a Gaussian measure on the space $\mathcal{F}_d$ of fields on $\mathbb{R}^d$ corresponding to the heuristic expression (20.20). For $d\ge 2$, however, this measure is not supported on any space of ordinary functions, but rather on a space of distributions.

We will not prove this result here; see Sect. 8.5 of [14] for more information. Here, then, is the problem with the path integral approach to quantum field theory on space-times of dimension $d\ge 2$: The functional $\int_{\mathbb{R}^d}\phi(x)^4\,dx$ does not make sense for a “typical” field with respect to the measure $\mu$ in (20.20). As a result, we cannot make sense of (20.19) simply by absorbing all the Gaussian part into the definition of the measure $\mu$, since what is left over is not a $\mu$-almost everywhere defined functional of $\phi$. Indeed, even a local integral, of the form $\int_U\phi(x)^4\,dx$ for some bounded region $U$ in $\mathbb{R}^d$, fails to be almost-everywhere defined with respect to $\mu$. After all, if $\int_U\phi(x)^4\,dx$ made sense, then $\phi$ would be a locally $L^4$ function, rather than a distribution.

It should be emphasized that the difficulty described in the previous 
paragraph is not just a technicality that can be swept away by some simple 
trick. Furthermore, this difficulty is not specific to $\phi^4$ theory, but is present
in all “nontrivial” field theories. In all interesting field theories, the fields 
defined by the Gaussian part of the path integral are fundamentally “too 
rough” to allow us to make sense of the non-Gaussian part of the integral. 
This phenomenon is the fundamental mathematical difficulty in the path 
integral approach to quantum field theory. 

To have a chance to make rigorous sense of path integrals in quantum 
field theory, one has to employ a complicated regularization process known 
as renormalization. This process has, so far, been carried out in a rigorous 
fashion only for a very small number of field theories. One of the Clay 
Millennium Prize problems is to make rigorous sense out of the Yang-Mills
field theory in four space-time dimensions. See [14] for a detailed survey
of the mathematical issues connected with the path integral approach to 
quantum field theory. See also [13] for a treatment of quantum field theory 
and renormalization with a greater eye toward the physical content. 

Since the roughness of the fields is a major problem in trying to give a rigorous meaning to path integrals, let us think for a moment about why it arises. Suppose we wish to construct a Gaussian measure from a certain heuristic expression of the form $\mu = Ce^{-Q(x)}\,\mathcal{D}x$, where $Q$ is a positive-definite quadratic functional of $x$. A reasonable approach is to consider the (real) Hilbert space $H$ for which $\|x\|_H^2 = Q(x)$. [In the case of (20.20), $H$ would be the “Sobolev space” of fields having one derivative in $L^2$.] The heuristic expression for the Gaussian measure then takes the form

$$ d\mu(x) = Ce^{-\|x\|_H^2}\,\mathcal{D}x. \tag{20.21} $$

One might now try to approximate $\mu$ by Gaussian measures $\mu_N$ on Hilbert spaces $H_N$ of dimension $N<\infty$. If $\dim H<\infty$, then the expression (20.21) is perfectly rigorous, where the constant $C$ may be taken to normalize $\mu$ to be a probability measure. A simple calculation (Exercise 4), however, shows that for any $R$, we have

$$ \lim_{N\to\infty}\mu_N(B_{R,N}) = 0, $$

where $B_{R,N}$ denotes the ball of radius $R$ in $H_N$. This means that in the $N\to\infty$ limit, all of the “mass” of the measure is outside the ball of radius $R$, for every $R$. Thus, in the limit, the measure is supported entirely on points $x$ where $\|x\|_H=\infty$, that is, on points that are not actually in $H$. The measures $\mu_N$ do converge to a measure $\mu$ as $N$ tends to infinity, but $\mu$ does not live on $H$, but on some larger space $B\supset H$. The original space $H$ is a set of $\mu$-measure zero inside $B$. See [16] for more information. In the case of the measure $\mu$ corresponding to the heuristic expression in (20.20), $\mu$ does not, as the expression suggests, live on the Sobolev space of fields with one derivative in $L^2$, but on a larger space, which turns out to be a space of distributions.
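The escape of Gaussian mass from every fixed ball can be seen numerically. The sketch below is an illustration under stated assumptions rather than part of the text: it uses the normalization $d\mu_N(x)=\pi^{-N/2}e^{-\|x\|^2}\,dx$ from Exercise 4, restricts to even $N$ so that the radial integral has an elementary closed form, and the function names `ball_mass` and `cube_bound` are hypothetical.

```python
import math


def ball_mass(N, R):
    """mu_N(B_{R,N}) for even N, where d mu_N = pi^(-N/2) exp(-|x|^2) dx.

    In polar coordinates the mass equals the regularized lower incomplete
    gamma function P(N/2, R^2); for N = 2k this is
        1 - exp(-R^2) * sum_{j=0}^{k-1} R^(2j) / j!.
    """
    if N % 2 != 0:
        raise ValueError("the closed form used here requires even N")
    k = N // 2
    x = R * R
    term, partial_sum = 1.0, 0.0
    for j in range(k):
        partial_sum += term
        term *= x / (j + 1)            # next term x^(j+1) / (j+1)!
    return 1.0 - math.exp(-x) * partial_sum


def cube_bound(N, R):
    """mu_N of the cube [-R, R]^N, which contains the ball: erf(R)^N."""
    return math.erf(R) ** N


if __name__ == "__main__":
    for N in (2, 10, 40):
        print(N, ball_mass(N, 2.0), cube_bound(N, 2.0))
```

For fixed $R$ the cube bound decays geometrically in $N$, and the ball mass decays faster still, which is the finite-dimensional shadow of the statement that $H$ has measure zero in the limit. For very small masses the subtraction in `ball_mass` loses floating-point accuracy, so the sketch should not be trusted far into that regime.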

20.7 Exercises 

1. Verify the identity (20.3) in the proof of the Trotter product formula. 

2. Verify (20.5) in the proof of the Trotter product formula, using Stone’s theorem and the following identity:

$$ \frac{1}{s}\left(e^{isA}e^{isB}-I\right)\psi = e^{isA}(iB\psi) + e^{isA}\left(\frac{1}{s}\left(e^{isB}-I\right)\psi - iB\psi\right) + \frac{1}{s}\left(e^{isA}-I\right)\psi. $$

3. Suppose $\{A_N\}$ is a family of bounded operators mapping a Banach space $W_1$ to a Banach space $W_2$. Suppose that for some constant $C$, we have $\|A_N\|\le C$ for all $N$. Finally, suppose that $\|A_N\psi\|\to 0$ as $N\to\infty$, for every $\psi\in W_1$.

(a) Show that for each $\psi\in W_1$ and each $\varepsilon>0$, there exists a neighborhood $U$ of $\psi$ and an integer $M$ such that

$$ \|A_N\phi\|<\varepsilon $$

for all $\phi\in U$ and $N\ge M$.

(b) If $K$ is a compact subset of $W_1$, show that $\|A_N\psi\|$ tends to zero uniformly for $\psi\in K$.

4. (a) Let $H_N$ be an $N$-dimensional Hilbert space. Show that the measure

$$ d\mu_N(x) := \pi^{-N/2}e^{-\|x\|^2}\,dx $$

is a probability measure. Here $dx$ is the Lebesgue measure on $H_N$, normalized so that the unit cube has volume 1.

Hint: Use Proposition A.22.

(b) Let $B_{R,N}$ denote the ball of radius $R$ in $H_N$. Show that for each $R<\infty$, there exists a number $a_R<1$ such that

$$ \mu_N(B_{R,N})\le (a_R)^N. $$

Thus, $\lim_{N\to\infty}\mu_N(B_{R,N})=0$.

Hint: The ball $B_{R,N}$ is contained in a cube centered at the origin with side length $2R$.


21 

Hamiltonian Mechanics on Manifolds 


In this chapter, we generalize the Hamiltonian approach to mechanics (introduced already in the Euclidean case in Sect. 2.5) to general manifolds. The chapter assumes familiarity with the basic notions of smooth manifolds, including tangent and cotangent spaces, vector fields, and differential forms. These notions are reviewed very briefly in Sect. 21.1, mainly in the interest of fixing the notation. See, for example, Chap. 2 of [40] for a concise treatment of manifolds and [29] for a detailed account. Throughout the chapter, we will use the summation convention, in which repeated indices are always summed on.


21.1 Calculus on Manifolds 

Throughout this section, M will denote a smooth, n-dimensional manifold. 


21.1.1 Tangent Spaces, Vector Fields, and Flows 

For each $x\in M$, we have the tangent space to $M$ at $x$, denoted $T_xM$. Given a smooth coordinate system $x_1,\ldots,x_n$ on $M$, the vectors

$$ \frac{\partial}{\partial x_1},\ \ldots,\ \frac{\partial}{\partial x_n} \tag{21.1} $$

form a basis for the tangent space at each point. A vector field $X$ on $M$ is a map assigning to each point $x\in M$ an element $X_x$ of $T_xM$. A vector field $X$ is smooth if the coefficients of $X$ in a basis of the form (21.1) are smooth functions, for every smooth coordinate system. As in Exercise 14 in Chap. 2, we think of a vector field as a first-order differential operator satisfying the Leibniz rule:

$$ X(fg) = X(f)g + fX(g). $$


Given a smooth vector field $X$ on $M$ and a point $x\in M$, there exists a curve $\gamma_x:(a,b)\to M$ such that $\gamma_x(0)=x$ and

$$ \frac{d\gamma_x}{dt} = X_{\gamma_x(t)}. $$

Any two such curves agree on the intersection of their intervals of definition. There is a largest interval $(a_x^{\max},b_x^{\max})$ on which such a curve can be defined. If, for each $x\in M$, we have $a_x^{\max}=-\infty$ and $b_x^{\max}=+\infty$, we say that the vector field $X$ is complete. If $M$ is compact, then each smooth vector field on $M$ is complete. We may assemble the curves $\gamma_x$ into the flow $\Phi$ generated by $X$, defined as

$$ \Phi_t(x) = \gamma_x(t), $$

whenever $a_x^{\max}<t<b_x^{\max}$. If $t$ does not belong to $(a_x^{\max},b_x^{\max})$, then $\Phi_t(x)$ is not defined. The flow $\Phi$ satisfies

$$ \Phi_0(x) = x. \tag{21.2} $$

Furthermore, if $x$ is in the domain of $\Phi_t$ and $\Phi_t(x)$ is in the domain of $\Phi_s$, then $x$ is in the domain of $\Phi_{s+t}$ and

$$ \Phi_s(\Phi_t(x)) = \Phi_{s+t}(x). \tag{21.3} $$

In the other direction, given a family of maps $\Phi_t$ satisfying (21.2) and (21.3) and appropriate domain properties, there is a unique vector field $X$ such that $\Phi$ is the flow generated by $X$. In particular, if $\Phi_t(x)$ is defined for all $x$ and $t$, is smooth as a map of $M\times\mathbb{R}$ into $M$, and satisfies (21.2) and (21.3), there is a unique complete vector field $X$ such that $\Phi$ is the flow generated by $X$.
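The properties (21.2) and (21.3), and the role of the maximal interval, can be checked on two flows that are known in closed form. This is a small illustrative sketch; the two vector fields and the function names `rotation_flow` and `blowup_flow` are choices made here, not examples from the text.

```python
import math


def rotation_flow(t, point):
    """Flow of the complete vector field X = -y d/dx + x d/dy on R^2.

    The integral curves are circles, so Phi_t is rotation by angle t and is
    defined for all t.
    """
    x, y = point
    c, s = math.cos(t), math.sin(t)
    return (c * x - s * y, s * x + c * y)


def blowup_flow(t, x):
    """Flow of X = x^2 d/dx on R: Phi_t(x) = x / (1 - t*x).

    The integral curve through x exists only while 1 - t*x > 0 (for x > 0
    this means t < 1/x), so X is not complete. Outside the maximal interval
    we return None to signal that Phi_t(x) is undefined.
    """
    if 1.0 - t * x <= 0.0:
        return None
    return x / (1.0 - t * x)


if __name__ == "__main__":
    z = (1.0, 2.0)
    print(rotation_flow(0.4, rotation_flow(0.3, z)), rotation_flow(0.7, z))
    print(blowup_flow(0.3, blowup_flow(0.2, 1.0)), blowup_flow(0.5, 1.0))
    print(blowup_flow(1.0, 1.0))       # t = 1 is outside the maximal interval for x = 1
```

The second flow shows why (21.3) is stated with domain conditions: the composition law still holds wherever both sides are defined, even though no single time $t\ne 0$ works for every starting point.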


21.1.2 Differential Forms 

For each $x$, the tangent space $T_xM$ is an $n$-dimensional real vector space. The dual vector space to $T_xM$ is the cotangent space to $M$ at $x$, denoted $T_x^*M$. Given a smooth function $f$ on $M$ and a point $x\in M$, the differential of $f$ at $x$ is the element of $T_x^*M$ given by

$$ df(X) = X(f) $$

for each $X\in T_xM$. In particular, in any local coordinate system $x_1,\ldots,x_n$, the elements $dx_1,\ldots,dx_n$ satisfy

$$ dx_j\left(\frac{\partial}{\partial x_k}\right) = \delta_{jk}. $$

Thus, the elements $dx_1,\ldots,dx_n$ form a basis for $T_x^*M$ at each point. For any smooth function $f$, we have

$$ df = \frac{\partial f}{\partial x_j}\,dx_j. \tag{21.4} $$

A $k$-form $\alpha$ on $M$ is a mapping assigning to each point $x\in M$ a $k$-linear, alternating functional $\alpha_x$ on $T_xM$. A $k$-form is smooth if $\alpha(X_1,\ldots,X_k)$ is a smooth function on $M$ for each $k$-tuple of smooth vector fields $X_1,\ldots,X_k$ on $M$. In particular, if $f$ is a smooth function, then $df$ is a smooth 1-form. If $\alpha$ is a smooth $k$-form and $X$ a smooth vector field, we may define the contraction of $\alpha$ with $X$, which is the $(k-1)$-form $i_X\alpha$ given by

$$ (i_X\alpha)(X_1,\ldots,X_{k-1}) = \alpha(X,X_1,\ldots,X_{k-1}). $$

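Given a $k$-linear form $\phi$ on a vector space $V$, define the antisymmetrization $\mathrm{AS}(\phi)$ of $\phi$ by

$$ \mathrm{AS}(\phi)(v_1,\ldots,v_k) = \sum_{\sigma\in S_k}\operatorname{sign}(\sigma)\,\phi(v_{\sigma(1)},v_{\sigma(2)},\ldots,v_{\sigma(k)}), $$

where $S_k$ denotes the permutation group on $k$ elements. Given a $k$-form $\alpha$ and an $l$-form $\beta$ on $M$, let $\alpha\otimes\beta$ be the $(k+l)$-linear form on each $T_xM$ given by

$$ (\alpha\otimes\beta)(X_1,\ldots,X_{k+l}) = \alpha(X_1,\ldots,X_k)\,\beta(X_{k+1},\ldots,X_{k+l}). $$

Then let $\alpha\wedge\beta$ denote the $(k+l)$-form given by

$$ \alpha\wedge\beta = \mathrm{AS}(\alpha\otimes\beta). $$

In particular, if $\alpha$ and $\beta$ are 1-forms, then $\alpha\wedge\beta$ is the 2-form given by

$$ (\alpha\wedge\beta)(X,Y) = \alpha(X)\beta(Y) - \alpha(Y)\beta(X). $$

In a smooth coordinate system $x_1,\ldots,x_n$, a smooth $k$-form $\alpha$ can be expressed uniquely as

$$ \alpha = \sum_{j_1<\cdots<j_k} a_{j_1\cdots j_k}(x)\,dx_{j_1}\wedge\cdots\wedge dx_{j_k}. $$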

A 2-form $\omega$ on $M$ is said to be nondegenerate if $\omega$ defines a nondegenerate bilinear form on each $T_xM$. More explicitly, this means that for each $x\in M$ and each nonzero $X\in T_xM$, there exists a $Y\in T_xM$ such that

$$ \omega(X,Y)\ne 0. $$

Suppose $\alpha$ is a smooth $k$-form on $M$ and $S$ is a compact, oriented, $k$-dimensional submanifold-with-boundary of $M$. Then one can define the integral of $\alpha$ over $S$. There is a map $d$, called the exterior derivative, mapping smooth $k$-forms to smooth $(k+1)$-forms and having the property that

$$ \int_S d\beta = \int_{\partial S}\beta \tag{21.5} $$

for every compact, oriented, $k$-dimensional submanifold-with-boundary $S$ of $M$ and every $(k-1)$-form $\beta$ on $M$. Here $\partial S$ is the boundary of $S$, with the natural orientation induced by the orientation on $S$. The relation (21.5) is known as Stokes’ theorem. A $k$-form $\alpha$ is said to be closed if $d\alpha=0$.

The exterior derivative may be computed in coordinates by the formula

$$ d(f\,dx_{j_1}\wedge\cdots\wedge dx_{j_k}) = \frac{\partial f}{\partial x_l}\,dx_l\wedge dx_{j_1}\wedge\cdots\wedge dx_{j_k}. $$

A coordinate-invariant formula for the exterior derivative of a $k$-form $\alpha$ is:

$$ d\alpha(X_1,\ldots,X_{k+1}) = \sum_{j=1}^{k+1}(-1)^{j+1}X_j\bigl(\alpha(X_1,\ldots,\hat X_j,\ldots,X_{k+1})\bigr) + \sum_{j<l}(-1)^{j+l}\alpha\bigl([X_j,X_l],X_1,\ldots,\hat X_j,\ldots,\hat X_l,\ldots,X_{k+1}\bigr), $$

where $\hat X_j$ indicates that the $X_j$ term is omitted and where $[X_j,X_l]$ is the commutator of $X_j$ and $X_l$ as first-order differential operators. In particular, if $\alpha$ is a 1-form, we have

$$ (d\alpha)(X,Y) = X(\alpha(Y)) - Y(\alpha(X)) - \alpha([X,Y]). \tag{21.6} $$
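As a quick sanity check of (21.6) against the coordinate formula, here is a short worked computation on $\mathbb{R}^2$ with coordinates $(x,y)$. The particular choices $\alpha = x\,dy$, $X=\partial/\partial x$, and $Y=\partial/\partial y$ are made here purely for illustration.

```latex
\begin{align*}
\alpha(X) &= 0, \qquad \alpha(Y) = x, \qquad [X,Y] = 0, \\
(d\alpha)(X,Y) &= X(\alpha(Y)) - Y(\alpha(X)) - \alpha([X,Y])
               = \frac{\partial x}{\partial x} - 0 - 0 = 1, \\
d\alpha &= dx\wedge dy, \qquad
(dx\wedge dy)(X,Y) = dx(X)\,dy(Y) - dx(Y)\,dy(X) = 1 .
\end{align*}
```

Both routes give the value 1, as they should.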

A key identity satisfied by the exterior derivative is

$$ d(d\alpha) = 0 $$

for all $k$-forms $\alpha$. Conversely, if $\beta$ is a closed $(k+1)$-form (i.e., $d\beta=0$), then $\beta$ can be expressed locally in the form $\beta=d\alpha$ for some $k$-form $\alpha$. More precisely, if $\beta$ is closed, then for any $x\in M$ there exists a neighborhood $U$ of $x$ and a $k$-form $\alpha$ defined on $U$ such that $\beta=d\alpha$ on $U$. If $M$ satisfies certain topological conditions, then each closed $k$-form $\alpha$ on $M$ can be expressed globally in the form $\alpha=d\beta$. In particular, if $M$ is simply connected, then each closed 1-form $\beta$ can be expressed globally in the form $\beta=df$ for some smooth function (i.e., 0-form) $f$.

If $X$ is a vector field and $\alpha$ is a $k$-form, we may define the Lie derivative of $\alpha$ in the direction of $X$, denoted $\mathcal{L}_X\alpha$, as follows:

$$ \mathcal{L}_X\alpha = \left.\frac{d}{dt}(\Phi_t)^*(\alpha)\right|_{t=0}, $$

where $\Phi_t$ is the flow generated by $X$ and $(\Phi_t)^*(\alpha)$ is the pullback of $\alpha$ by $\Phi_t$. The Lie derivative may be computed using the formula

$$ \mathcal{L}_X = i_X\circ d + d\circ i_X. \tag{21.7} $$


21.2 Mechanics on Symplectic Manifolds 

The reader is warned that sign conventions in the subject of Hamiltonian 
mechanics are not consistent from author to author. 

21.2.1 Symplectic Manifolds 

A symplectic manifold is, roughly, a manifold with enough additional structure to allow one to define the Poisson bracket of two functions.

Definition 21.1 A symplectic manifold is a smooth manifold $N$ together with a closed, nondegenerate 2-form $\omega$ on $N$. If $(N_1,\omega_1)$ and $(N_2,\omega_2)$ are symplectic manifolds, a map $\Phi:N_1\to N_2$ is a symplectomorphism if $\Phi$ is a diffeomorphism and in addition

$$ \Phi^*(\omega_2) = \omega_1. $$

It is not hard to see that every symplectic manifold must be even dimensional, for the simple reason that an odd-dimensional vector space does not admit a nondegenerate, skew-symmetric bilinear form.

Throughout this chapter, $N$ will always denote a symplectic manifold of dimension $2n$ with symplectic form $\omega$.

We now show that the cotangent bundle of any manifold has the structure of a symplectic manifold in a canonical way. Suppose $x_1,\ldots,x_n$ is a coordinate system defined on an open set $U\subset M$. Then at each point $x\in U$, an element $\phi$ of $T_x^*M$ can be expressed uniquely in the form

$$ \phi = p_j\,dx_j $$

for a sequence $p_1,\ldots,p_n$ of real numbers. The quantities $x_1,\ldots,x_n$ and $p_1,\ldots,p_n$ constitute a coordinate system on $\pi^{-1}(U)$, where $\pi:T^*M\to M$ is the canonical projection. We refer to a coordinate system of this sort as a standard coordinate system on $T^*M$.

Example 21.2 For any smooth manifold $M$, define a 1-form $\theta$ on the cotangent bundle $T^*M$ by

$$ \theta(X) = \phi(\pi_*(X)) $$

for each tangent vector $X\in T_{(x,\phi)}(T^*M)$, where $\pi:T^*M\to M$ is the canonical projection. Then the 2-form $\omega:=d\theta$ is closed and nondegenerate. We refer to $\theta$ and $\omega$ as the canonical 1-form and the canonical 2-form on $T^*M$, respectively.

Proof. Using a coordinate system $\{x_j\}$ on $M$ and the associated standard coordinate system $\{x_j,p_j\}$ on $T^*M$, the projection $\pi$ is given by $\pi(x,p)=x$. Meanwhile, a tangent vector $X$ to $T^*M$ is expressible as a linear combination of the $\partial/\partial x_j$’s and $\partial/\partial p_j$’s. Thus,

$$ \theta\left(a_j\frac{\partial}{\partial x_j}+b_j\frac{\partial}{\partial p_j}\right) = p_j\,dx_j\left(a_k\frac{\partial}{\partial x_k}\right) = p_ja_j. $$

What this means is that

$$ \theta = p_j\,dx_j, $$

where the $x_j$’s are now viewed as functions on $T^*M$ rather than on $M$. We have, then,

$$ \omega = d\theta = dp_j\wedge dx_j. $$

Since $\omega=d\theta$, the form $\omega$ is closed. It is now easy to see that $\omega$ is nondegenerate (Exercise 1). ■

21.2.2 Poisson Brackets and Hamiltonian Vector Fields 

If $\omega$ is nondegenerate, then it gives a canonical identification of $T_zN$ with $T_z^*N$ at each point, by identifying a vector $X$ in $T_zN$ with the linear functional $\omega(X,\cdot)$ in $T_z^*N$. We can then transfer the bilinear form $\omega$ from $T_zN$ to $T_z^*N$ by means of this identification. We denote the resulting bilinear form on $T_z^*N$ by $\omega^{-1}$.

Definition 21.3 If $f$ and $g$ are smooth functions on $N$, define the Poisson bracket $\{f,g\}$ of $f$ and $g$ by

$$ \{f,g\} = -\omega^{-1}(df,dg). $$

In particular, if 1 denotes the constant function on $N$, then $\{1,f\}=\{f,1\}=0$ for all smooth functions $f$.

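Example 21.4 If $\omega$ is the canonical 2-form on $T^*M$, then the associated Poisson bracket may be computed in standard coordinates as

$$ \{f,g\} = \frac{\partial f}{\partial x_j}\frac{\partial g}{\partial p_j} - \frac{\partial f}{\partial p_j}\frac{\partial g}{\partial x_j} $$

for all smooth functions $f$ and $g$ on $T^*M$.

Proof. The linear functional

$$ \omega\left(\frac{\partial}{\partial x_j},\cdot\right) $$

has a value of $-1$ on the vector $\partial/\partial p_j$ and a value of 0 on all the other basic partial derivatives. This means that $\omega(\partial/\partial x_j,\cdot)=-dp_j$. Similarly, $\omega(\partial/\partial p_j,\cdot)=dx_j$. We may thus compute, for example, that

$$ \omega\left(\frac{\partial}{\partial x_j},\frac{\partial}{\partial p_j}\right) = \omega^{-1}(-dp_j,dx_j) = \omega^{-1}(dx_j,dp_j), $$

so that $\omega^{-1}(dx_j,dp_j)=-1$ (no sum on $j$). Meanwhile, $\omega^{-1}(dx_j,dx_k)=\omega^{-1}(dp_j,dp_k)=0$ and $\omega^{-1}(dp_j,dx_k)=0$ when $j\ne k$. Thus, we compute that

$$ \{f,g\} = -\omega^{-1}\left(\frac{\partial f}{\partial x_j}\,dx_j+\frac{\partial f}{\partial p_j}\,dp_j,\ \frac{\partial g}{\partial x_k}\,dx_k+\frac{\partial g}{\partial p_k}\,dp_k\right) = \frac{\partial f}{\partial x_j}\frac{\partial g}{\partial p_k}\,\delta_{jk} - \frac{\partial f}{\partial p_j}\frac{\partial g}{\partial x_k}\,\delta_{jk}, $$

which reduces to the claimed expression. ■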
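The coordinate formula for the Poisson bracket is easy to experiment with numerically. The sketch below is an illustration only: it treats the case $T^*\mathbb{R}\cong\mathbb{R}^2$ with coordinates $(x,p)$, approximates partial derivatives by central differences, and uses the hypothetical helper names `partial` and `poisson`.

```python
def partial(f, slot, h=1e-4):
    """Central-difference partial derivative of f(x, p).

    slot = 0 differentiates in x, slot = 1 differentiates in p.
    """
    def derivative(x, p):
        if slot == 0:
            return (f(x + h, p) - f(x - h, p)) / (2.0 * h)
        return (f(x, p + h) - f(x, p - h)) / (2.0 * h)
    return derivative


def poisson(f, g):
    """Return the function {f, g} = f_x g_p - f_p g_x on T*R = R^2."""
    fx, fp = partial(f, 0), partial(f, 1)
    gx, gp = partial(g, 0), partial(g, 1)
    return lambda x, p: fx(x, p) * gp(x, p) - fp(x, p) * gx(x, p)


if __name__ == "__main__":
    position = lambda x, p: x
    momentum = lambda x, p: p
    f = lambda x, p: x * x * p
    g = lambda x, p: x * p ** 3
    print(poisson(position, momentum)(0.3, -1.2))   # approximately 1
    print(poisson(f, g)(1.5, 0.5))                  # approximately 5 x^2 p^3
```

With this sign convention the position and momentum functions satisfy $\{x,p\}=1$. Because the derivatives are finite-difference approximations, identities such as skew-symmetry of nested brackets hold only up to a small numerical error.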

Proposition 21.5 For any smooth functions $f,g,h$ on $N$, we have

$$ \{g,f\} = -\{f,g\} $$

and

$$ \{f,gh\} = \{f,g\}h + g\{f,h\}. $$

Proof. Since $\omega$ is skew-symmetric on the tangent space to $N$ at each point and $\omega^{-1}$ is obtained from $\omega$ by means of an isomorphism of tangent and cotangent space, $\omega^{-1}$ is a skew-symmetric form on the cotangent space. The skew-symmetry of the Poisson bracket follows. The second relation follows from the Leibniz product rule for $d(gh)$ together with the bilinearity of $\omega^{-1}$. ■

Definition 21.6 If $f$ is a smooth function on $N$, let $X_f$ be the unique vector field on $N$ such that

$$ df = \omega(X_f,\cdot). \tag{21.8} $$

We call $X_f$ the Hamiltonian vector field associated to $f$.

That is to say, $X_f$ corresponds to $df$ under the isomorphism between tangent and cotangent spaces established by $\omega$.

Proposition 21.7 For all $f$ and $g$,

$$ X_f(g) = \{f,g\} = -X_g(f). $$

Furthermore,

$$ \omega(X_f,X_g) = -\{f,g\}. $$

Proof. For each $z\in N$, we are using $\omega$ to identify $T_zN$ with $T_z^*N$. Equation (21.8) says that under this identification, $X_f$ is identified with $df$. Thus,

$$ -\omega^{-1}(df,dg) = -\omega(X_f,X_g) = -df(X_g) = -X_g(f). $$

Thus, $\{f,g\}=-X_g(f)$, as claimed. A similar argument with the roles of $f$ and $g$ reversed gives the claimed relationship between $X_f(g)$ and $\{g,f\}$. Finally,

$$ \omega(X_f,X_g) = df(X_g) = X_g(f) = -\{f,g\}, $$

as claimed. ■

Definition 21.8 For any smooth function $f$ on $N$, the Hamiltonian flow generated by $f$, denoted $\Phi^f$, is the flow generated by the vector field $-X_f$.

In the case $N=T^*\mathbb{R}^n\cong\mathbb{R}^{2n}$, this definition agrees with our notation in Sect. 2.5.

Proposition 21.9 For any smooth function $f$ on $N$, the Hamiltonian flow $\Phi^f$ preserves $\omega$.

Proof. In general, a flow $\Phi$ preserves a differential form $\alpha$ if and only if the Lie derivative $\mathcal{L}_X\alpha=0$, where $X$ is the vector field generating $\Phi$. In our case, since $\omega$ is closed, we have, by (21.7),

$$ \mathcal{L}_{X_f}\omega = d[i_{X_f}\omega] = d^2f = 0, $$

since $i_{X_f}\omega$ is, by the definition of $X_f$, equal to $df$. ■

Proposition 21.10 For any smooth functions $f,g,h$ on $N$, the Jacobi identity holds:

$$ \{f,\{g,h\}\} + \{g,\{h,f\}\} + \{h,\{f,g\}\} = 0. $$

This result shows that the space of smooth functions on $N$ forms a Lie algebra under the Poisson bracket. The proof of Proposition 21.10 relies on Proposition 21.9, which in turn relies on the fact that $\omega$ is closed.

Proof. Since the Hamiltonian flow $\Phi^f$ preserves $\omega$, it also preserves $\omega^{-1}$ and thus

$$ \omega^{-1}\bigl(d(g\circ\Phi_t^f),d(h\circ\Phi_t^f)\bigr) = \omega^{-1}(dg,dh)\circ\Phi_t^f $$

or, equivalently,

$$ \{g\circ\Phi_t^f,\,h\circ\Phi_t^f\} = \{g,h\}\circ\Phi_t^f. $$

Differentiating this relation with respect to $t$ at $t=0$ gives

$$ \{-X_f(g),h\} + \{g,-X_f(h)\} = -X_f(\{g,h\}), $$

or, equivalently, since $X_f(g)=\{f,g\}$ by Proposition 21.7,

$$ -\{\{f,g\},h\} - \{g,\{f,h\}\} = -\{f,\{g,h\}\}. $$

After moving $-\{f,\{g,h\}\}$ to the left-hand side of the equation and using the skew-symmetry of the Poisson bracket, we obtain the Jacobi identity. ■


Proposition 21.11 For any smooth functions $f$ and $g$ on $N$, the Hamiltonian vector fields $X_f$ and $X_g$ satisfy

$$ [X_f,X_g] = X_{\{f,g\}}. $$

Proof. See Exercise 3. ■

21.2.3 Hamiltonian Flows and Conserved Quantities 

We have seen (Proposition 21.9) that if $f$ is a smooth function, then the flow generated by $X_f$ preserves $\omega$. We have the following partial converse to this result.

Proposition 21.12 Suppose $\Phi$ is the flow generated by a vector field $-X$ on $N$. If $\Phi$ preserves $\omega$ then $X$ can be represented locally in the form $X=X_f$ for some smooth function $f$ on $N$. If $N$ is simply connected, the function $f$ exists globally on $N$.

Proof. The statement that $\Phi$ preserves $\omega$ can be expressed infinitesimally as

$$ \mathcal{L}_X\omega = 0. $$

Since also $\omega$ is closed, (21.7) tells us that

$$ d(i_X\omega) = 0. $$

Since $i_X\omega$ is closed, this 1-form can be expressed locally as $i_X\omega=df$ for some smooth function $f$, which says precisely that $X=X_f$. If $N$ is simply connected, then every closed 1-form can be expressed globally as $df$, for some smooth function $f$. ■

A flow of the sort in Proposition 21.12 is said to be locally Hamiltonian. Such a flow is said to be (globally) Hamiltonian if the function $f$ in the proposition can be defined on all of $N$. (Compare Definition 21.8.) If $\Phi$ is a Hamiltonian flow, the function $f$ such that $\Phi=\Phi^f$ is called a Hamiltonian generator of $\Phi$. If $N$ is connected, then any two Hamiltonian generators of $\Phi$ must differ by a constant.

To see that, in general, $f$ is only defined locally, consider the symplectic manifold $S^1\times\mathbb{R}$, with symplectic form $\omega=d\phi\wedge dx$, where $\phi$ is the angular coordinate on $S^1$ and $x$ is the linear coordinate on $\mathbb{R}$. Note that the 1-form $d\phi$ is independent of the choice of a local angle variable on $S^1$, since any two such angle functions differ by a constant (an integer multiple of $2\pi$). Thus, $d\phi$ is a globally defined, smooth 1-form, even though there is no globally defined, smooth angle function $\phi$. Define a flow $\Phi$ by

$$ \Phi_t(\phi,x) = (\phi,x+t). $$

This flow certainly preserves $\omega$, since $dx$ is invariant under translations. The flow $\Phi$ is generated by the vector field $-X=\partial/\partial x$, and

$$ \omega(-\partial/\partial x,\cdot) = d\phi. $$

As we have already noted, however, there is no globally defined function $\phi$ whose differential is $d\phi$.

Although any smooth function on a symplectic manifold N generates a 
Hamiltonian flow, in physical examples there is usually one distinguished 
function with a Hamiltonian flow that is thought of as “the” time-evolution 
of the system. 

Definition 21.13 A Hamiltonian system is a symplectic manifold $N$ together with a distinguished Hamiltonian flow $\Phi^H$ generated by a smooth function $H$ on $N$, called the Hamiltonian of the system. A function $f$ is called a conserved quantity for a Hamiltonian system $(N,\Phi^H)$ if $f(\Phi_t^H(x))$ is independent of $t$ for each fixed $x\in N$.

As in the $\mathbb{R}^{2n}$ case, conserved quantities are useful in understanding the nature of the dynamics. See the discussion following Corollary 2.26.

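Proposition 21.14 For any Hamiltonian system $(N,\Phi^H)$ and any smooth function $f$ on $N$, we have

$$ \frac{d}{dt}f(\Phi_t^H(z)) = \{f,H\}(\Phi_t^H(z)) $$

for all $z\in N$, or, more concisely,

$$ \frac{df}{dt} = \{f,H\}. $$

In particular, a smooth function $f$ on $N$ is a conserved quantity for a Hamiltonian system $(N,\Phi^H)$ if and only if $\{f,H\}=0$.

Proof. For the flow $\Phi$ generated by any vector field $X$, we have

$$ \frac{d}{dt}f(\Phi_t(z)) = X_{\Phi_t(z)}f. $$

If $X=-X_H$, then by Proposition 21.7, $Xf=-X_H(f)=-\{H,f\}=\{f,H\}$, and we have the claimed result. ■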
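The relation between time derivatives along a Hamiltonian flow and Poisson brackets with $H$ can be checked on a flow that is known exactly. The following sketch uses the harmonic oscillator $H=(x^2+p^2)/2$ on $\mathbb{R}^2$, whose flow (generated by $-X_H$, i.e., $\dot x = p$, $\dot p=-x$) is a rotation of the phase plane; the example and the function names are choices made here for illustration.

```python
import math


def oscillator_flow(t, z):
    """Exact Hamiltonian flow of H = (x^2 + p^2)/2: dx/dt = p, dp/dt = -x."""
    x, p = z
    c, s = math.cos(t), math.sin(t)
    return (x * c + p * s, p * c - x * s)


def hamiltonian(z):
    x, p = z
    return 0.5 * (x * x + p * p)


def time_derivative(f, t, z, dt=1e-5):
    """Central-difference approximation to d/dt f(Phi_t(z))."""
    forward = f(oscillator_flow(t + dt, z))
    backward = f(oscillator_flow(t - dt, z))
    return (forward - backward) / (2.0 * dt)


if __name__ == "__main__":
    z0, t = (1.0, 0.5), 0.7
    xt, pt = oscillator_flow(t, z0)
    # For f = x p, the bracket is {f, H} = f_x H_p - f_p H_x = p^2 - x^2.
    print(time_derivative(lambda z: z[0] * z[1], t, z0), pt * pt - xt * xt)
    # H itself is conserved, since {H, H} = 0.
    print(hamiltonian(z0), hamiltonian(oscillator_flow(t, z0)))
```

The first pair of printed numbers should agree up to finite-difference error, and the second pair should agree up to rounding.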

Proposition 21.15 A smooth function $f$ is a conserved quantity for a Hamiltonian system $(N,\Phi^H)$ if and only if $H$ is invariant under the Hamiltonian flow generated by $f$.

Proof. By the previous proposition, $H$ is invariant under the flow generated by $f$ if and only if $\{H,f\}=0$, which holds if and only if $\{f,H\}=0$, which holds if and only if $f$ is a conserved quantity. ■

21.2.4 The Liouville Form 

A symplectic manifold $N$ has a natural volume form, which allows us to formulate an analog on $N$ of Liouville’s theorem (Theorem 2.27).

Definition 21.16 If $N$ is a $2n$-dimensional symplectic manifold, the Liouville form on $N$ is the $2n$-form $\lambda$ given by

$$ \lambda = \frac{1}{n!}\,\omega^n, $$

where $\omega^n=\omega\wedge\cdots\wedge\omega$.

Since $\omega$ is, by assumption, a nondegenerate form on each tangent space $T_zN$, it is not hard to check that $\lambda$ is a nonvanishing $(2n)$-linear form on each $T_zN$. Thus, $\lambda$ determines an orientation on $N$. Given a compactly supported continuous function $f$ on $N$, we can define the integral of $f\lambda$ over $N$, computed with respect to the orientation determined by $\lambda$ itself. Using the version of the Riesz representation theorem for locally compact topological spaces, one can show that there is a unique measure, called the Liouville volume measure, for which the integral of every continuous compactly supported function $f$ is given by $\int_N f\lambda$.

We are now ready to state the general form of Liouville’s theorem. 

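Theorem 21.17 (Liouville’s Theorem) For any smooth function $f$ on $N$, the Hamiltonian flow $\Phi^f$ preserves $\lambda$.

Proof. The flow $\Phi^f$ will preserve $\lambda$ if and only if the vector field $X_f$ satisfies $\mathcal{L}_{X_f}\lambda=0$. But

$$ \mathcal{L}_{X_f}\lambda = \frac{1}{n!}\bigl[(\mathcal{L}_{X_f}\omega)\wedge\omega\wedge\cdots\wedge\omega + \omega\wedge(\mathcal{L}_{X_f}\omega)\wedge\omega\wedge\cdots\wedge\omega + \cdots + \omega\wedge\cdots\wedge\omega\wedge(\mathcal{L}_{X_f}\omega)\bigr]. $$

Since we have already shown (Proposition 21.9) that $\mathcal{L}_{X_f}\omega=0$, we see that $\mathcal{L}_{X_f}\lambda=0$. ■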
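In the case $N=\mathbb{R}^2$, where $\lambda=\omega$ is just the area form, Liouville’s theorem says that a Hamiltonian flow has Jacobian determinant 1. The sketch below checks this numerically for a map built from exact Hamiltonian flows. It is an illustration under stated assumptions: the pendulum Hamiltonian $H=p^2/2-\cos x$ and the names `kick_drift` and `jacobian_det` are chosen here, and the map is the composition of the time-$h$ flow of $-\cos x$ with the time-$h$ flow of $p^2/2$ (the “symplectic Euler” step), not the exact flow of $H$ itself.

```python
import math


def kick_drift(z, h=0.1):
    """Composition of two exact Hamiltonian flows on R^2.

    Kick:  time-h flow of V(x) = -cos(x), i.e. p -> p - h*sin(x), x fixed.
    Drift: time-h flow of p^2/2,          i.e. x -> x + h*p,      p fixed.
    """
    x, p = z
    p = p - h * math.sin(x)
    x = x + h * p
    return (x, p)


def jacobian_det(F, z, eps=1e-6):
    """Central-difference Jacobian determinant of a map F: R^2 -> R^2 at z."""
    x, p = z
    fxp, fxm = F((x + eps, p)), F((x - eps, p))
    fpp, fpm = F((x, p + eps)), F((x, p - eps))
    a = (fxp[0] - fxm[0]) / (2.0 * eps)   # d(x')/dx
    c = (fxp[1] - fxm[1]) / (2.0 * eps)   # d(p')/dx
    b = (fpp[0] - fpm[0]) / (2.0 * eps)   # d(x')/dp
    d = (fpp[1] - fpm[1]) / (2.0 * eps)   # d(p')/dp
    return a * d - b * c


if __name__ == "__main__":
    print(jacobian_det(kick_drift, (1.0, 0.3)))   # approximately 1
```

Each factor is a shear with determinant exactly 1, so the composite map, and any number of iterations of it, preserves area; the finite-difference determinant reproduces this up to numerical error.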


21.3 Exercises 

1. Show that the canonical 2-form $\omega$ on $T^*M$ is nondegenerate.
Hint: Work in standard coordinates $\{x_j,p_j\}$.

2. Show that if $\Phi:M\to M$ is a diffeomorphism, then the induced map $\Phi^*:T^*M\to T^*M$ is a symplectomorphism.

3. Using Proposition 21.7 and the Jacobi identity for the Poisson bracket, verify that

$$ [X_f,X_g] = X_{\{f,g\}} $$

for all smooth functions $f$ and $g$ on $N$.

4. If $N$ is compact, show that

$$ \int_N\{f,g\}\,\lambda = 0 $$

for all smooth functions $f$ and $g$ on $N$.

Hint: Apply Liouville’s theorem to the flow $\Phi^f$.


22 

Geometric Quantization on Euclidean 
Space 


22.1 Introduction 

In this chapter, we consider the geometric quantization program in the setting of the symplectic manifold $\mathbb{R}^{2n}$, with the canonical 2-form $\omega=dp_j\wedge dx_j$. We begin with the “prequantum” Hilbert space $L^2(\mathbb{R}^{2n})$ and define “prequantum” operators $Q_{\mathrm{pre}}(f)$. These operators satisfy

$$ Q_{\mathrm{pre}}(\{f,g\}) = \frac{1}{i\hbar}[Q_{\mathrm{pre}}(f),Q_{\mathrm{pre}}(g)] $$

for all $f$ and $g$. Nevertheless, there are several undesirable aspects to the prequantization map that make it physically unreasonable to interpret it as “quantization.” To obtain the quantum Hilbert space, we reduce the number of variables from $2n$ to $n$. Depending on how we do this reduction, we will obtain either the position Hilbert space, the momentum Hilbert space, or the Segal-Bargmann space. Each of these subspaces is preserved by the prequantized position and momentum operators, and by certain other operators of the form $Q_{\mathrm{pre}}(f)$.

Although the material in this chapter is a special case of what we do in 
Chap. 23, doing this case first allows us to get a feeling for the methods and 
results of geometric quantization quickly, without needing to develop the 
full machinery of line bundles, connections, and polarizations over general 
symplectic manifolds. In any case, we would need to carry out most of the 
calculations in this chapter eventually, as standard examples of the general theory.

Although this chapter does not require the full machinery of symplectic manifolds, we will make use of the notions of 1-forms and 2-forms on $\mathbb{R}^{2n}$, along with the notion of the differential of a 1-form. In particular, the expression (21.6) for the differential of a 1-form will be used.

The reader should be warned that sign conventions in geometric quantization are not consistent from author to author. The sign conventions used here are chosen to maintain consistency with the physics literature. In particular, we could eliminate an annoying minus sign in the definition of the holomorphic subspace if we were willing to allow the function $p_j$ to quantize to $i\hbar\,\partial/\partial x_j$. Since, however, the convention $P_j=-i\hbar\,\partial/\partial x_j$ is universal in the physics literature, we have chosen to be consistent with that convention and to accept some slightly inconvenient sign choices elsewhere. We continue to follow the summation convention, in which repeated indices are always summed on.


22.2 Prequantization 

Ideally, a quantization procedure $Q$, mapping functions on a symplectic manifold $N$ to operators on some Hilbert space $\mathbf{H}$, should satisfy the following properties. First, $Q(f)$ should be self-adjoint whenever $f$ is real valued. Second, we should have $Q(1)=I$, where 1 is the constant function. Third, $Q(\{f,g\})$ should be equal to $[Q(f),Q(g)]/(i\hbar)$. Fourth, there should be some sort of “smallness” assumption. In the case $N=\mathbb{R}^{2n}$, for example, we may require that $\mathbf{H}$ should be irreducible under the action of the (exponentiated) position and momentum operators. (See Definition 14.6.) Although Groenewold’s theorem (Theorem 13.13) suggests that it is unrealistic to expect to find a quantization procedure that satisfies all of these properties exactly, we try to come as close as possible.

Throughout this chapter, we follow the convention of thinking of a “vector
field” on $\mathbb{R}^N$ as a first-order differential operator, as in
Exercise 14 in Chap. 2. Given, for example, the vector-valued function

\[
  X = (2x_1 + x_2,\; x_1 x_2)
\]

on $\mathbb{R}^2$, we identify $X$ with the operator of “differentiation in
the direction of $X$,” that is, with the following first-order differential
operator:

\[
  X = (2x_1 + x_2)\frac{\partial}{\partial x_1}
      + x_1 x_2\,\frac{\partial}{\partial x_2}.
\]

In particular, given a smooth function $f$ on $\mathbb{R}^{2n}$, the
Hamiltonian vector field $X_f$ associated to $f$ is thought of as a
differential operator:

\[
  X_f = \{f,\cdot\}
      = \frac{\partial f}{\partial x_j}\frac{\partial}{\partial p_j}
      - \frac{\partial f}{\partial p_j}\frac{\partial}{\partial x_j},
  \tag{22.1}
\]

acting on $C^\infty(\mathbb{R}^{2n})$. (Compare Proposition 21.7.) By
Proposition 21.11, the commutator (as differential operators) of two
Hamiltonian vector fields $X_f$ and $X_g$ is $X_{\{f,g\}}$. Thus, the
operators $i\hbar X_f$ satisfy the desired commutation relations:

\[
  [i\hbar X_f, i\hbar X_g] = (i\hbar)^2 X_{\{f,g\}}
                           = (i\hbar)(i\hbar X_{\{f,g\}}).
\]

It is tempting, then, to define a (pre)quantization map simply by taking
$Q(f) = i\hbar X_f$, viewed as a self-adjoint operator on the Hilbert space
$L^2(\mathbb{R}^{2n})$. This map, however, does not satisfy $Q(1) = I$. If
we try to correct our definition to $Q(f) = i\hbar X_f + f$, where $f$ means
the operator of multiplication by $f$, then $Q(1) = I$ but the desired
commutation property is destroyed.

It is possible to achieve both $Q(1) = I$ and the desired commutation
relations by adding one more term as follows. If $\omega = dp_j \wedge dx_j$
is the canonical 2-form on $\mathbb{R}^{2n}$, let $\theta$ be any symplectic
potential for $\omega$, that is, any one-form with

\[
  d\theta = \omega. \tag{22.2}
\]

(We may, e.g., take $\theta = p_j\,dx_j$.) For a smooth function $f$ on
$\mathbb{R}^{2n}$, define an operator $Q_{\mathrm{pre}}(f)$, acting on
$C^\infty(\mathbb{R}^{2n})$, by

\[
  Q_{\mathrm{pre}}(f)
    = i\hbar\left(X_f - \frac{i}{\hbar}\,\theta(X_f)\right) + f.
  \tag{22.3}
\]

The expression $f$ on the right-hand side of (22.3) means, more precisely,
the operator of multiplication by $f$, and similarly for the function
$\theta(X_f)$. Note that since $\theta$ is a 1-form and $X_f$ is a vector
field, $\theta(X_f)$ is a function on $\mathbb{R}^{2n}$. The operator
$Q_{\mathrm{pre}}(f)$ is the prequantization of $f$ and is to be viewed as
an unbounded operator on $L^2(\mathbb{R}^{2n})$, where we refer to
$L^2(\mathbb{R}^{2n})$ as the prequantum Hilbert space.

According to Exercise 1, any divergence free vector field on $\mathbb{R}^N$
is a skew-symmetric operator on
$C_c^\infty(\mathbb{R}^N) \subset L^2(\mathbb{R}^N)$. Meanwhile, each
Hamiltonian vector field is divergence free, as we have already remarked in
the proof of Liouville’s theorem (Theorem 2.27). Thus, for any smooth,
real-valued function $f$ on $\mathbb{R}^{2n}$, the operator
$Q_{\mathrm{pre}}(f)$ is at least symmetric. It can be shown that if $X_f$
is complete, meaning that the associated Hamiltonian flow is defined for all
times, then $Q_{\mathrm{pre}}(f)$ is actually self-adjoint on a natural
domain. (See the discussion following the proof of Proposition 23.13.)

As it turns out, the $\theta(X_f)$ term in (22.3) is precisely what is
needed to restore the desired commutation relations, while still allowing
$Q_{\mathrm{pre}}(1)$ to equal the identity.
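
The role of the $\theta(X_f)$ term can be checked mechanically on polynomial
observables. The following Python sketch is an illustration only, under
stated assumptions: one degree of freedom, the potential $\theta = p\,dx$
(so that $\theta(X_f) = -p\,\partial f/\partial p$), the arbitrary value
$\hbar = 0.5$, and a hypothetical dictionary representation of polynomials in
$(x, p)$. None of the function names come from the text. It computes the
defect $[Q(f), Q(g)]\psi - i\hbar\,Q(\{f,g\})\psi$ with and without the
$\theta(X_f)$ term.

```python
# Polynomials in (x, p) are dicts {(i, j): coeff} meaning coeff * x**i * p**j.
# HBAR = 0.5 is an illustrative choice; being a power of two, it keeps the
# floating-point arithmetic on small integer coefficients exact.
HBAR = 0.5

def clean(a):
    return {k: c for k, c in a.items() if c != 0}

def add(a, b):
    out = dict(a)
    for k, c in b.items():
        out[k] = out.get(k, 0) + c
    return clean(out)

def scale(s, a):
    return clean({k: s * c for k, c in a.items()})

def mul(a, b):
    out = {}
    for (i, j), c in a.items():
        for (k, l), d in b.items():
            out[(i + k, j + l)] = out.get((i + k, j + l), 0) + c * d
    return clean(out)

def dx(a):
    return clean({(i - 1, j): i * c for (i, j), c in a.items() if i > 0})

def dp(a):
    return clean({(i, j - 1): j * c for (i, j), c in a.items() if j > 0})

def bracket(f, g):
    # Poisson bracket {f, g} = f_x g_p - f_p g_x, which is also X_f applied to g.
    return add(mul(dx(f), dp(g)), scale(-1, mul(dp(f), dx(g))))

P = {(0, 1): 1}  # the coordinate function p

def q_pre(f, psi, with_theta=True):
    # (22.3) expanded: Q_pre(f) psi = i*hbar*X_f psi + theta(X_f) psi + f psi.
    # with_theta=False gives the "naive" map i*hbar*X_f + f instead.
    mult = add(scale(-1, mul(P, dp(f))), f) if with_theta else f
    return add(scale(1j * HBAR, bracket(f, psi)), mul(mult, psi))

def commutator_defect(f, g, psi, with_theta=True):
    # [Q(f), Q(g)] psi - i*hbar*Q({f, g}) psi; empty dict means the relation holds.
    qf_qg = q_pre(f, q_pre(g, psi, with_theta), with_theta)
    qg_qf = q_pre(g, q_pre(f, psi, with_theta), with_theta)
    rhs = scale(1j * HBAR, q_pre(bracket(f, g), psi, with_theta))
    return add(add(qf_qg, scale(-1, qg_qf)), scale(-1, rhs))
```

For $f = x$, $g = p$, and $\psi = 1$, the defect computed without the
$\theta(X_f)$ term is the constant $i\hbar$, reflecting
$[i\hbar\,\partial/\partial p + x,\ -i\hbar\,\partial/\partial x + p] = 2i\hbar$,
whereas with the term included the defect is the zero polynomial.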

Proposition 22.1 For all $f, g \in C^\infty(\mathbb{R}^{2n})$, we have

\[
  \frac{1}{i\hbar}\,[Q_{\mathrm{pre}}(f), Q_{\mathrm{pre}}(g)]
    = Q_{\mathrm{pre}}(\{f,g\}),
\]

where the identity is to be understood as an equality of operators on
$C^\infty(\mathbb{R}^{2n})$.

Before proving this result, it is useful to understand the behavior of the
expression $X_f - (i/\hbar)\theta(X_f)$ occurring in the definition of
$Q_{\mathrm{pre}}(f)$.

Definition 22.2 For any symplectic potential $\theta$ and vector field $X$
on $\mathbb{R}^{2n}$, let $\nabla_X$ denote the covariant derivative
operator, acting on $C^\infty(\mathbb{R}^{2n})$, given by

\[
  \nabla_X = X - \frac{i}{\hbar}\,\theta(X). \tag{22.4}
\]

Note that our prequantized operators can be written as

\[
  Q_{\mathrm{pre}}(f) = i\hbar\,\nabla_{X_f} + f.
\]

Proposition 22.3 For any symplectic potential $\theta$, let $\nabla_X$
denote the associated covariant derivative in (22.4). Then for all smooth
vector fields $X$ and $Y$ on $\mathbb{R}^{2n}$, we have

\[
  [\nabla_X, \nabla_Y] = \nabla_{[X,Y]} - \frac{i}{\hbar}\,\omega(X,Y).
  \tag{22.5}
\]

In particular, if $X = X_f$ and $Y = X_g$, we have

\[
  [\nabla_{X_f}, \nabla_{X_g}] = \nabla_{X_{\{f,g\}}} + \frac{i}{\hbar}\{f,g\}.
\]

According to standard differential geometric definitions, the 2-form
$\omega/\hbar$ on the right-hand side of (22.5) is the curvature of the
covariant derivative $\nabla$. For our purposes, the fact that
$[\nabla_{X_f}, \nabla_{X_g}]$ is not simply $\nabla_{X_{\{f,g\}}}$ is an
advantage. The extra term in the formula for the commutator is just what we
need to compensate for the failure of the operators $i\hbar X_f + f$ to have
the desired commutation relations.

Proof. Using the easily verified identity $[\nabla_X, f] = X(f)$, we obtain

\[
  [\nabla_X, \nabla_Y] - \nabla_{[X,Y]}
    = -\frac{i}{\hbar}\left[X(\theta(Y)) - Y(\theta(X)) - \theta([X,Y])\right].
\]

In light of (21.6), the right-hand side becomes
$-(i/\hbar)(d\theta)(X,Y)$, where $d\theta = \omega$. ■

We may now easily prove Proposition 22.1. 

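Proof of Proposition 22.1. Using Proposition 22.3, we obtain

\[
\begin{aligned}
  \frac{1}{i\hbar}\,[i\hbar\nabla_{X_f} + f,\ i\hbar\nabla_{X_g} + g]
    &= i\hbar\left(\nabla_{X_{\{f,g\}}} + \frac{i}{\hbar}\{f,g\}\right)
       + X_f(g) - X_g(f) \\
    &= i\hbar\,\nabla_{X_{\{f,g\}}} - \{f,g\} + \{f,g\} + \{f,g\},
\end{aligned}
\]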


which reduces to what we want. ■

Example 22.4 If $\theta = p_j\,dx_j$, the prequantized position and momentum
operators are given by

\[
\begin{aligned}
  Q_{\mathrm{pre}}(x_j) &= x_j + i\hbar\,\frac{\partial}{\partial p_j}, \\
  Q_{\mathrm{pre}}(p_j) &= -i\hbar\,\frac{\partial}{\partial x_j}.
\end{aligned}
\]

These operators are essentially self-adjoint on
$C_c^\infty(\mathbb{R}^{2n})$ and their self-adjoint extensions satisfy the
exponentiated commutation relations of Definition 14.2.

Proof. We compute that $X_{x_j} = \partial/\partial p_j$ and that
$\theta(X_{x_j}) = 0$, giving the indicated expression for
$Q_{\mathrm{pre}}(x_j)$. Meanwhile, $X_{p_j} = -\partial/\partial x_j$ and
$\theta(X_{p_j}) = -p_j$. There is a cancellation of the $\theta(X_{p_j})$
term in the definition of $Q_{\mathrm{pre}}(p_j)$ with the $p_j$ term,
leaving $Q_{\mathrm{pre}}(p_j) = i\hbar X_{p_j}$.

The essential self-adjointness of the operators follows from Proposition
9.40. To verify the exponentiated commutation relations, we calculate the
associated one-parameter unitary groups as

\[
\begin{aligned}
  (e^{itQ_{\mathrm{pre}}(x_j)}\psi)(\mathbf{x},\mathbf{p})
    &= e^{itx_j}\,\psi(\mathbf{x}, \mathbf{p} - t\hbar e_j), \\
  (e^{itQ_{\mathrm{pre}}(p_j)}\psi)(\mathbf{x},\mathbf{p})
    &= \psi(\mathbf{x} + t\hbar e_j, \mathbf{p}),
\end{aligned}
\tag{22.6}
\]

where we now let $Q_{\mathrm{pre}}(x_j)$ and $Q_{\mathrm{pre}}(p_j)$ denote
the unique self-adjoint extensions of the given operators on
$C_c^\infty(\mathbb{R}^{2n})$. (Compare Proposition 13.5.) The exponentiated
commutation relations can now be easily verified by direct calculation. ■
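
The “direct calculation” can be illustrated numerically. Composing the two
formulas in (22.6) gives
$e^{isQ_{\mathrm{pre}}(x)}e^{itQ_{\mathrm{pre}}(p)}
 = e^{-ist\hbar}\,e^{itQ_{\mathrm{pre}}(p)}e^{isQ_{\mathrm{pre}}(x)}$,
which we take to be the form of the relations in Definition 14.2 (an
assumption, since that definition lies outside this chapter). The Python
sketch below, for one degree of freedom, with an arbitrary value of $\hbar$
and a hypothetical test function `psi0`, evaluates both sides at a point.

```python
import cmath

HBAR = 0.7  # illustrative value, not from the text

def exp_x(s):
    # psi -> e^{i s Q_pre(x)} psi, using the first formula in (22.6)
    return lambda psi: (lambda x, p: cmath.exp(1j * s * x) * psi(x, p - s * HBAR))

def exp_p(t):
    # psi -> e^{i t Q_pre(p)} psi, using the second formula in (22.6)
    return lambda psi: (lambda x, p: psi(x + t * HBAR, p))

def psi0(x, p):
    # Hypothetical smooth, rapidly decaying test function on R^2.
    return cmath.exp(-(x - 0.3) ** 2 - 2 * p ** 2 + 1j * x * p)

def relation_defect(s, t, x, p):
    # | e^{isQ(x)} e^{itQ(p)} psi0 - e^{-i s t hbar} e^{itQ(p)} e^{isQ(x)} psi0 | at (x, p)
    lhs = exp_x(s)(exp_p(t)(psi0))(x, p)
    rhs = cmath.exp(-1j * s * t * HBAR) * exp_p(t)(exp_x(s)(psi0))(x, p)
    return abs(lhs - rhs)
```

Both sides reduce to $e^{isx}\psi(x + t\hbar, p - s\hbar)$, so
`relation_defect` vanishes up to rounding for any choice of arguments.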

As we have presented things so far, the concept of covariant derivative,
and thus also of prequantization, depends on the choice of symplectic
potential $\theta$. This dependence is, however, illusory; we will now show
that the prequantum maps obtained with two different symplectic potentials
are unitarily equivalent.

Proposition 22.5 Suppose that $\theta^1$ and $\theta^2$ are two different
symplectic potentials for the canonical 2-form $\omega$, so that
$d(\theta^1 - \theta^2) = 0$. Let the associated covariant derivatives be
denoted by $\nabla^1$ and $\nabla^2$. Choose a real-valued function $\gamma$
so that $d\gamma = \theta^1 - \theta^2$ and let $U_\gamma$ be the unitary map
of $L^2(\mathbb{R}^{2n})$ to itself given by

\[
  U_\gamma\psi = e^{-i\gamma/\hbar}\psi.
\]

Then for every vector field $X$, we have

\[
  U_\gamma \nabla^1_X U_\gamma^{-1} = \nabla^2_X. \tag{22.7}
\]

If $Q^j_{\mathrm{pre}}(f)$, $j = 1, 2$, are the associated prequantization
maps, it follows that

\[
  U_\gamma Q^1_{\mathrm{pre}}(f) U_\gamma^{-1} = Q^2_{\mathrm{pre}}(f).
  \tag{22.8}
\]

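The map $U_\gamma$ is called a gauge transformation.

Proof. The operation of multiplication by $\theta^1(X)$ commutes with
multiplication by $e^{-i\gamma/\hbar}$, whereas

\[
  X(e^{i\gamma/\hbar}\psi)
    = e^{i\gamma/\hbar}X\psi + \frac{i}{\hbar}e^{i\gamma/\hbar}X(\gamma)\psi.
\]

Since $X(\gamma) = (d\gamma)(X) = \theta^1(X) - \theta^2(X)$, we obtain

\[
\begin{aligned}
  \nabla^1_X(e^{i\gamma/\hbar}\psi)
    &= e^{i\gamma/\hbar}\left(X + \frac{i}{\hbar}X(\gamma)
       - \frac{i}{\hbar}\theta^1(X)\right)\psi \\
    &= e^{i\gamma/\hbar}\left(X - \frac{i}{\hbar}\theta^2(X)\right)\psi \\
    &= e^{i\gamma/\hbar}\,\nabla^2_X\psi.
\end{aligned}
\]

Multiplying both sides of this equality by $e^{-i\gamma/\hbar}$ gives
(22.7). Equation (22.8) follows by observing that multiplication by $f$
commutes with multiplication by $e^{-i\gamma/\hbar}$. ■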


22.3 Problems with Prequantization 

Given the naturalness of the prequantization construction, it is tempting
to think that prequantization could actually be considered as quantization.
Why not take our Hilbert space to be $L^2(\mathbb{R}^{2n})$ and the
quantized operators to be $Q_{\mathrm{pre}}(f)$? To answer this question, we
now examine some undesirable properties of prequantization.

In the first place, the Hilbert space $L^2(\mathbb{R}^{2n})$ is very far
from irreducible under the action of the quantized position and momentum
operators, in contrast to the ordinary Schrödinger Hilbert space
$L^2(\mathbb{R}^n)$, which is irreducible, by Proposition 14.7. Indeed, in
Sect. 22.4, we will construct a large family of invariant subspaces. (See
Proposition 22.13.)

In the second place, the prequantization map is very far from being
multiplicative. Of course, since quantum operators do not commute, we cannot
expect any quantization scheme $Q$ to satisfy $Q(fg) = Q(f)Q(g)$ for all $f$
and $g$. Nevertheless, the standard quantization schemes we have considered
in Chap. 13 do satisfy this relation for certain classes of observables $f$
and $g$. In the Weyl quantization, for example, we have multiplicativity if
$f$ and $g$ are both functions of $\mathbf{x}$ only, independent of
$\mathbf{p}$ (or functions of $\mathbf{p}$, independent of $\mathbf{x}$).
For the prequantization map, however, we almost never have multiplicativity,
for the simple reason that $Q_{\mathrm{pre}}(fg)$ is a first-order
differential operator, whereas $Q_{\mathrm{pre}}(f)Q_{\mathrm{pre}}(g)$ is
second-order, provided there is at least one point where $X_f$ and $X_g$ are
both nonzero.

In the third place, the prequantization map badly fails to map positive
functions to positive operators. Although most of the quantization schemes
in Chap. 13 do not always map positive functions to positive operators, they
somehow come close to doing so. Indeed, $Q_{\mathrm{Weyl}}$,
$Q_{\mathrm{Wick}}$, and $Q_{\mathrm{anti\text{-}Wick}}$ all map the
harmonic oscillator Hamiltonian to a non-negative operator, since
$a^*a + (1/2)I$, $a^*a$, and $aa^*$ are all non-negative. (See Exercise 4 in
Chap. 13.) By contrast, the prequantized harmonic oscillator Hamiltonian
has spectrum that is unbounded below, as we now demonstrate.

Proposition 22.6 Consider a harmonic oscillator Hamiltonian of the form

\[
  H(x,p) = \frac{1}{2m}\left(p^2 + (m\omega x)^2\right).
\]

Then for each integer $n$, the number $n\hbar\omega$ is an eigenvalue for
$Q_{\mathrm{pre}}(H)$.

Note that $n$ in the proposition is allowed to be negative, so that the
spectrum of $Q_{\mathrm{pre}}(H)$ is not even bounded below. On the other
hand, in Sect. 22.5, we will consider a certain closed subspace
$\mathbf{H}_\alpha$ of the prequantum Hilbert space, which is one candidate
for the quantum Hilbert space. For an appropriate choice of $\alpha$, the
space $\mathbf{H}_\alpha$ is invariant under $Q_{\mathrm{pre}}(H)$ and the
restriction of $Q_{\mathrm{pre}}(H)$ is self-adjoint with spectrum
$n\hbar\omega$, where $n$ ranges over the non-negative integers. See
Proposition 22.14. Finally, when we introduce half-forms in Sect. 23.7, we
will restore the spectrum $(n + 1/2)\hbar\omega$, where $n$ ranges over the
non-negative integers, that we found in Chap. 11.

Proof. We can write $H$ as

\[
  H = \frac{1}{2m}\left(p^2 + y^2\right),
\]

where $y = m\omega x$. The flow associated to this Hamiltonian consists of
rotations in the $(y,p)$-plane. If we choose our symplectic potential to be

\[
  \theta = \frac{1}{2}(p\,dx - x\,dp)
         = \frac{1}{2m\omega}(p\,dy - y\,dp),
\]

then the $\theta(X_H)$ term in $Q_{\mathrm{pre}}(H)$ cancels with the $H$
term, leaving

\[
  Q_{\mathrm{pre}}(H) = i\hbar X_H
    = i\hbar\omega\left(y\frac{\partial}{\partial p}
      - p\frac{\partial}{\partial y}\right).
\]

Now, if $\phi$ denotes the angular variable for polar coordinates in the
$(y,p)$-plane, then $y\,\partial/\partial p - p\,\partial/\partial y$ is
just $\partial/\partial\phi$. Thus, we can find eigenvectors for
$Q_{\mathrm{pre}}(H)$ of the form

\[
  \psi_n(r,\phi) = f(r)e^{-in\phi},
\]

where $n$ is an integer and $f$ is an arbitrary function with
$\int_0^\infty |f(r)|^2\, r\,dr < \infty$. ■
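
The eigenvalue claim, including negative $n$, can be spot-checked
numerically. The following Python sketch is only an illustration: it assumes
the form $Q_{\mathrm{pre}}(H) = i\hbar\omega(y\,\partial/\partial p -
p\,\partial/\partial y)$ obtained in the proof, picks the radial profile
$f(r) = e^{-r^2}$ and arbitrary values of $\hbar$ and $\omega$, and
approximates derivatives by central differences.

```python
import cmath
import math

HBAR, OMEGA = 1.0, 2.0  # illustrative values, not from the text

def psi_n(n, y, p):
    # f(r) e^{-i n phi} with the (assumed) radial profile f(r) = exp(-r^2)
    r, phi = math.hypot(y, p), math.atan2(p, y)
    return math.exp(-r * r) * cmath.exp(-1j * n * phi)

def q_pre_h(psi, y, p, h=1e-5):
    # i*hbar*omega*(y d/dp - p d/dy), derivatives by central differences
    d_p = (psi(y, p + h) - psi(y, p - h)) / (2 * h)
    d_y = (psi(y + h, p) - psi(y - h, p)) / (2 * h)
    return 1j * HBAR * OMEGA * (y * d_p - p * d_y)

def eigen_defect(n, y, p):
    # | Q_pre(H) psi_n - n*hbar*omega*psi_n | at the point (y, p)
    psi = lambda a, b: psi_n(n, a, b)
    return abs(q_pre_h(psi, y, p) - n * HBAR * OMEGA * psi(y, p))
```

Evaluating `eigen_defect(n, 0.6, 0.4)` for, say, $n = -3, 0, 2$ gives values
at the level of the finite-difference error. The sample point should be kept
away from the negative $y$-axis, where `atan2` has its branch cut.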

The conclusion of the matter is that it is not physically reasonable to 
use prequantization as our quantization scheme. Instead, we will pass to 
a “smaller” Hilbert space on which the position and momentum operators 
act irreducibly. 


22.4 Quantization 

To obtain a Hilbert space that can be thought of as giving us a
“quantization” (as opposed to a prequantization) of $\mathbb{R}^{2n}$, we
restrict ourselves to a subspace of the prequantum Hilbert space. The idea
is that we should be using only half of the variables on $\mathbb{R}^{2n}$.
We might, for example, restrict ourselves to functions that depend only on
the position variables and are independent of the momentum variables. Now,
the space of functions $\psi$ that are, say, independent of $\mathbf{p}$ in
the ordinary sense (i.e., $\psi(\mathbf{x},\mathbf{p}) =
\psi(\mathbf{x},\mathbf{p}')$) is not invariant under gauge transformations
(the maps $U_\gamma$ in Proposition 22.5). The gauge-invariant notion of
being independent of $\mathbf{p}$ is that the covariant derivatives of
$\psi$ should be zero in the $\mathbf{p}$-directions. Similarly, we may
consider spaces of functions with covariant derivatives that are zero in
some other set of $n$ directions.

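Definition 22.7 Fix a symplectic potential $\theta$. Define the position
subspace as the subspace of $C^\infty(\mathbb{R}^{2n})$ consisting of
functions $\psi$ for which

\[
  \nabla_{\partial/\partial p_j}\psi = 0
\]

for all $j$. Similarly, define the momentum subspace as the subspace of
$C^\infty(\mathbb{R}^{2n})$ consisting of functions $\psi$ for which

\[
  \nabla_{\partial/\partial x_j}\psi = 0
\]

for all $j$. Finally, define the holomorphic subspace with parameter
$\alpha$ to be the subspace of $C^\infty(\mathbb{R}^{2n})$ consisting of
functions $\psi$ for which

\[
  \nabla_{\partial/\partial\bar z_j}\psi = 0
\]

for all $j$, where $z_j = x_j - i\alpha p_j$ and where
$\partial/\partial z_j$ and $\partial/\partial\bar z_j$ are defined by

\[
  \frac{\partial}{\partial z_j}
    = \frac{1}{2}\left(\frac{\partial}{\partial x_j}
      + \frac{i}{\alpha}\frac{\partial}{\partial p_j}\right), \qquad
  \frac{\partial}{\partial\bar z_j}
    = \frac{1}{2}\left(\frac{\partial}{\partial x_j}
      - \frac{i}{\alpha}\frac{\partial}{\partial p_j}\right).
  \tag{22.9}
\]

The operators $\partial/\partial z_j$ and $\partial/\partial\bar z_j$ are
nothing but the usual complex derivative operators on $\mathbb{C}^n$ written
in terms of the variables $\mathbf{x}$ and $\mathbf{p}$, where we identify
$\mathbb{R}^{2n}$ with $\mathbb{C}^n$ by the map
$(\mathbf{x},\mathbf{p}) \mapsto \mathbf{x} - i\alpha\mathbf{p}$.

Of course, the exact form of the various subspaces in Definition 22.7
depends on the choice of symplectic potential. It is convenient to use the
symplectic potential $\theta = p_j\,dx_j$.

Proposition 22.8 Take the symplectic potential $\theta = p_j\,dx_j$. Then
the position, momentum, and holomorphic subspaces may be computed as
follows. The position subspace consists of smooth functions $\psi$ on
$\mathbb{R}^{2n}$ of the form

\[
  \psi(\mathbf{x},\mathbf{p}) = \phi(\mathbf{x}),
\]

where $\phi$ is an arbitrary smooth function on $\mathbb{R}^n$. The momentum
subspace consists of smooth functions $\psi$ of the form

\[
  \psi(\mathbf{x},\mathbf{p})
    = e^{i\mathbf{x}\cdot\mathbf{p}/\hbar}\phi(\mathbf{p}), \tag{22.10}
\]

where $\phi$ is an arbitrary smooth function on $\mathbb{R}^n$. Finally, the
holomorphic subspace consists of functions of the form

\[
  \psi(\mathbf{x},\mathbf{p})
    = F(z_1,\ldots,z_n)\,e^{-\alpha|\mathbf{p}|^2/(2\hbar)}, \tag{22.11}
\]

where $F$ is an arbitrary holomorphic function on $\mathbb{C}^n$ and where
$z_j = x_j - i\alpha p_j$.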

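Proof. Since $\theta(\partial/\partial p_j) = 0$, we have
$\nabla_{\partial/\partial p_j} = \partial/\partial p_j$, so that functions
that are covariantly constant in the $\mathbf{p}$-directions are actually
constant in the $\mathbf{p}$-directions. Meanwhile,
$\theta(\partial/\partial x_j) = p_j$ and so

\[
  \nabla_{\partial/\partial x_j}
    = \frac{\partial}{\partial x_j} - \frac{i}{\hbar}\,p_j.
\]

Now, any function $\psi$ on $\mathbb{R}^{2n}$ can be written in the form
$e^{i\mathbf{x}\cdot\mathbf{p}/\hbar}\phi(\mathbf{x},\mathbf{p})$ for some
other function $\phi$. If we use this form to compute
$\nabla_{\partial/\partial x_j}\psi$, there is a convenient cancellation,
giving

\[
  (\nabla_{\partial/\partial x_j}\psi)(\mathbf{x},\mathbf{p})
    = e^{i\mathbf{x}\cdot\mathbf{p}/\hbar}\,
      \frac{\partial\phi}{\partial x_j}.
\]

Thus, $\nabla_{\partial/\partial x_j}\psi = 0$ for all $j$ if and only if
$\phi$ is independent of $\mathbf{x}$.

Finally, we note that $\theta(\partial/\partial\bar z_j) = p_j/2$, so that

\[
  \nabla_{\partial/\partial\bar z_j}
    = \frac{\partial}{\partial\bar z_j} - \frac{i}{2\hbar}\,p_j.
\]

Any function $\psi$ on $\mathbb{R}^{2n}$ can be written in the form
$\psi(\mathbf{x},\mathbf{p}) = e^{-\alpha|\mathbf{p}|^2/(2\hbar)}F$ for
some other function $F$, where we note that

\[
  e^{-\alpha|\mathbf{p}|^2/(2\hbar)}
    = \exp\left\{(z_j - \bar z_j)^2/(8\alpha\hbar)\right\}.
\]

Thus,

\[
  \frac{\partial}{\partial\bar z_j}\,e^{-\alpha|\mathbf{p}|^2/(2\hbar)}
    = -\frac{z_j - \bar z_j}{4\alpha\hbar}\,
      e^{-\alpha|\mathbf{p}|^2/(2\hbar)}
    = \frac{i}{2\hbar}\,p_j\,e^{-\alpha|\mathbf{p}|^2/(2\hbar)}.
\]

When we compute $\nabla_{\partial/\partial\bar z_j}\psi$ using the indicated
form, there is another convenient cancellation, giving

\[
  (\nabla_{\partial/\partial\bar z_j}\psi)(\mathbf{x},\mathbf{p})
    = e^{-\alpha|\mathbf{p}|^2/(2\hbar)}\,
      \frac{\partial F}{\partial\bar z_j}.
\]

Thus, $\nabla_{\partial/\partial\bar z_j}\psi = 0$ for all $j$ if and only
if $F$ is holomorphic as a function of the variables
$z_j = x_j - i\alpha p_j$. ■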
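
The two cancellations in the proof can be observed numerically. The Python
sketch below is an illustration under stated assumptions: $n = 1$, the
potential $\theta = p\,dx$, arbitrary values of $\hbar$ and $\alpha$, and
hypothetical sample functions `holo` and `momentum` of the forms (22.11) and
(22.10). Derivatives are approximated by central differences, so the results
are zero only up to discretization error.

```python
import cmath

HBAR, ALPHA = 1.0, 0.5  # illustrative values, not from the text

def d_dx(psi, x, p, h=1e-5):
    return (psi(x + h, p) - psi(x - h, p)) / (2 * h)

def d_dp(psi, x, p, h=1e-5):
    return (psi(x, p + h) - psi(x, p - h)) / (2 * h)

def nabla_x(psi, x, p):
    # covariant derivative along d/dx: d/dx - (i/hbar) p
    return d_dx(psi, x, p) - 1j * p * psi(x, p) / HBAR

def nabla_zbar(psi, x, p):
    # covariant derivative along d/dzbar:
    # (1/2)(d/dx - (i/alpha) d/dp) - (i/(2 hbar)) p, using (22.9)
    d_zbar = 0.5 * (d_dx(psi, x, p) - (1j / ALPHA) * d_dp(psi, x, p))
    return d_zbar - 1j * p * psi(x, p) / (2 * HBAR)

def holo(x, p):
    # form (22.11) with the sample holomorphic function F(z) = z^3 + 2z
    z = x - 1j * ALPHA * p
    return (z ** 3 + 2 * z) * cmath.exp(-ALPHA * p * p / (2 * HBAR))

def momentum(x, p):
    # form (22.10) with the sample function phi(p) = p^2 + 1
    return cmath.exp(1j * x * p / HBAR) * (p ** 2 + 1)
```

At a sample point such as $(x,p) = (0.3, -0.7)$, both
`nabla_zbar(holo, ...)` and `nabla_x(momentum, ...)` are negligibly small,
while a generic function such as $\psi(x,p) = xp$ is not annihilated by
either operator.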

From the physical standpoint, we do not merely want a vector space of
functions, but a Hilbert space. It is natural, then, to look at functions of
the forms computed in Proposition 22.8 that belong to
$L^2(\mathbb{R}^{2n})$. In the case of the position and momentum subspaces,
we encounter a serious problem: There are no nonzero functions of the
indicated form that are square integrable over $\mathbb{R}^{2n}$. After all,
if $\psi$ is in the position subspace, then $\psi(\mathbf{x},\mathbf{p})$ is
independent of $\mathbf{p}$ and the integral of $|\psi|^2$ over the
$\mathbf{p}$-variables will be infinite, unless $\psi$ is zero almost
everywhere. If $\psi$ is in the momentum subspace, $|\psi|^2$ is independent
of $\mathbf{x}$ and we have a similar problem.

The solution to this problem is to integrate not over $\mathbb{R}^{2n}$ but
over $\mathbb{R}^n$. Although the “proper” way to make this change of
integration is to introduce the notion of “half-forms,” as in Chap. 23, we
will content ourselves in this chapter with the following simplistic rule:
integrate only over the variables on which $|\psi|^2$ depends. If we want to
get a Hilbert space (not just an inner product space), we must also allow
functions of the specified form that are square integrable but not
necessarily smooth. We may therefore identify the position Hilbert space and
momentum Hilbert space as follows.

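Conclusion 22.9 The position Hilbert space is the space of functions on
$\mathbb{R}^{2n}$ of the form

\[
  \psi(\mathbf{x},\mathbf{p}) = \phi(\mathbf{x}),
\]

where $\phi \in L^2(\mathbb{R}^n)$. The norm of such a function is computed
as

\[
  \|\psi\|^2 = \int_{\mathbb{R}^n} |\phi(\mathbf{x})|^2\,d\mathbf{x}.
\]

The momentum Hilbert space is the space of functions on $\mathbb{R}^{2n}$ of
the form

\[
  \psi(\mathbf{x},\mathbf{p})
    = e^{i\mathbf{x}\cdot\mathbf{p}/\hbar}\phi(\mathbf{p}),
\]

where $\phi \in L^2(\mathbb{R}^n)$. The norm of such a function is computed
as

\[
  \|\psi\|^2 = \int_{\mathbb{R}^n} |\phi(\mathbf{p})|^2\,d\mathbf{p}.
\]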

If we consider the holomorphic subspace, we find that it behaves better
than the position and momentum subspaces, in that there exist nonzero
functions of the form (22.11) that are square integrable over
$\mathbb{R}^{2n}$, as we will see shortly. Furthermore, the space of
functions of the form (22.11) that are square integrable over
$\mathbb{R}^{2n}$ forms a closed subspace of $L^2(\mathbb{R}^{2n})$, by the
same argument as in the proof of Proposition 14.15.

Conclusion 22.10 The holomorphic Hilbert space consists of those functions
$\psi$ of the form (22.11) that are square integrable over
$\mathbb{R}^{2n}$. If $\psi$ is identified with the holomorphic function $F$
in (22.11), then this Hilbert space may be identified with
$\mathcal{H}L^2(\mathbb{C}^n,\nu)$, where

\[
  \nu(z) = e^{-|\mathrm{Im}\,z|^2/(\alpha\hbar)}.
\]

The space $\mathcal{H}L^2(\mathbb{C}^n,\nu)$ is nothing but an invariant
form of the Segal-Bargmann space (Definition 14.14), where here “invariant”
means that the density $\nu$ is invariant under translations in the real
directions. This space can be identified unitarily with the ordinary
Segal-Bargmann space $\mathcal{H}L^2(\mathbb{C}^n,\mu_{2\alpha\hbar})$ as
follows. Define a map
$\Psi : \mathcal{H}L^2(\mathbb{C}^n,\mu_{2\alpha\hbar}) \to
\mathcal{H}L^2(\mathbb{C}^n,\nu)$ by

\[
  \Psi(F)(z) = (2\pi\alpha\hbar)^{-n/2}\,e^{-z^2/(4\alpha\hbar)}F(z),
  \tag{22.12}
\]

where $z^2 = z_1^2 + \cdots + z_n^2$. Then a simple calculation shows that

\[
  \|\Psi(F)\|^2_{L^2(\mathbb{C}^n,\nu)}
    = \int_{\mathbb{C}^n} |F(z)|^2\,\mu_{2\alpha\hbar}(z)\,dz.
\]

Since also $e^{-z^2/(4\alpha\hbar)}$ is holomorphic as a function of $z$, we
see that $\Psi$ maps $\mathcal{H}L^2(\mathbb{C}^n,\mu_{2\alpha\hbar})$
isometrically into $\mathcal{H}L^2(\mathbb{C}^n,\nu)$. The map $\Psi$ has an
inverse given by multiplication by
$(2\pi\alpha\hbar)^{n/2}e^{z^2/(4\alpha\hbar)}$, showing that $\Psi$ is
actually unitary. In particular, there exist many nonzero holomorphic
functions on $\mathbb{C}^n$ that belong to
$\mathcal{H}L^2(\mathbb{C}^n,\nu)$.
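
The “simple calculation” amounts to the pointwise identity
$|\Psi(F)(z)|^2\,\nu(z) = |F(z)|^2\,\mu_{2\alpha\hbar}(z)$. The Python sketch
below checks it at a sample point for $n = 1$. It assumes that the
Segal-Bargmann density of Definition 14.14 is
$\mu_t(z) = (\pi t)^{-n}e^{-|z|^2/t}$ (that definition lies outside this
chapter), and the values of $\hbar$ and $\alpha$ and the sample function $F$
are arbitrary.

```python
import cmath
import math

HBAR, ALPHA = 1.0, 0.5  # illustrative values, not from the text

def nu(z):
    # density of Conclusion 22.10 (n = 1)
    return math.exp(-(z.imag ** 2) / (ALPHA * HBAR))

def mu(z, t):
    # assumed Segal-Bargmann density (pi t)^{-1} exp(-|z|^2 / t), n = 1
    return math.exp(-abs(z) ** 2 / t) / (math.pi * t)

def Psi(F):
    # the map (22.12) for n = 1
    norm = (2 * math.pi * ALPHA * HBAR) ** -0.5
    return lambda z: norm * cmath.exp(-z * z / (4 * ALPHA * HBAR)) * F(z)

def density_defect(F, z):
    # | |Psi(F)(z)|^2 nu(z) - |F(z)|^2 mu_{2 alpha hbar}(z) |
    lhs = abs(Psi(F)(z)) ** 2 * nu(z)
    rhs = abs(F(z)) ** 2 * mu(z, 2 * ALPHA * HBAR)
    return abs(lhs - rhs)
```

The identity holds because
$|e^{-z^2/(4\alpha\hbar)}|^2 = e^{-((\mathrm{Re}\,z)^2 - (\mathrm{Im}\,z)^2)/(2\alpha\hbar)}$,
which combines with $\nu(z)$ to give $e^{-|z|^2/(2\alpha\hbar)}$.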

We will regard any of the Hilbert spaces in Conclusions 22.9 and 22.10 as
our quantum Hilbert space. These spaces are to be compared to the prequantum
Hilbert space $L^2(\mathbb{R}^{2n})$, which is in some sense “bigger,”
consisting of functions of twice as many variables. Note that there are
multiple possibilities for the quantum Hilbert space. To reduce from the
prequantum Hilbert space to the quantum Hilbert space, we have to choose a
set of $n$ variables, and then we look at functions that depend only on
those $n$ variables. Indeed, there are many other possibilities for the
quantum Hilbert space; we have considered only the most common choices. We
defer a discussion of the general theory until Chap. 23.

The reader may wonder why we are using the definition
$z_j = x_j - i\alpha p_j$ ($\alpha > 0$) rather than
$z_j = x_j + i\alpha p_j$. If we repeated the preceding calculations with
$z_j = x_j + i\alpha p_j$, with a corresponding sign change in the
definition of $\partial/\partial\bar z_j$, we would find that $\psi$
satisfies $\nabla_{\partial/\partial\bar z_j}\psi = 0$ for all $j$ if and
only if $\psi$ is of the form

\[
  \psi(\mathbf{x},\mathbf{p})
    = F(z_1,\ldots,z_n)\,e^{\alpha|\mathbf{p}|^2/(2\hbar)}, \tag{22.13}
\]

where $F$ is holomorphic on $\mathbb{C}^n$. The change in sign in the
exponent between (22.11) and (22.13) has a drastic effect: There are no
nonzero holomorphic functions $F$ for which the function $\psi$ in (22.13)
is square integrable over $\mathbb{R}^{2n}$. (See Exercise 3.) Unlike the
situation with the position and momentum Hilbert spaces, there is no natural
way to alter the domain of integration to make a function of the form
(22.13) have finite norm.

We see, then, that there is a big difference between the definitions
$z_j = x_j - i\alpha p_j$ and $z_j = x_j + i\alpha p_j$. In the general
framework of geometric quantization, we will have a similar distinction,
where complex structures satisfying a certain positivity condition behave
well, whereas the “opposite” complex structures behave badly. (See
Definition 23.19 in Sect. 23.4.)


22.5 Quantization of Observables 

Now that we have constructed our quantum (as opposed to prequantum) Hilbert
spaces, we need to construct operators on these spaces. According to the
standard geometric quantization program, the quantum operator associated
with a function $f$ is supposed to be simply the restriction to the quantum
Hilbert space of the prequantum operator $Q_{\mathrm{pre}}(f)$, provided
that $Q_{\mathrm{pre}}(f)$ leaves the quantum Hilbert space invariant.

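Proposition 22.11 The position, momentum, and holomorphic subspaces in
Definition 22.7 are all invariant under the prequantum operators
$Q_{\mathrm{pre}}(x_j)$ and $Q_{\mathrm{pre}}(p_j)$. Specifically, in the
position subspace, we have

\[
\begin{aligned}
  Q_{\mathrm{pre}}(x_j)\phi(\mathbf{x}) &= x_j\phi(\mathbf{x}), \\
  Q_{\mathrm{pre}}(p_j)\phi(\mathbf{x})
    &= -i\hbar\,\frac{\partial\phi}{\partial x_j};
\end{aligned}
\]

in the momentum subspace, we have

\[
\begin{aligned}
  Q_{\mathrm{pre}}(x_j)\bigl(e^{i\mathbf{x}\cdot\mathbf{p}/\hbar}
      \phi(\mathbf{p})\bigr)
    &= e^{i\mathbf{x}\cdot\mathbf{p}/\hbar}
       \left(i\hbar\,\frac{\partial\phi}{\partial p_j}(\mathbf{p})\right), \\
  Q_{\mathrm{pre}}(p_j)\bigl(e^{i\mathbf{x}\cdot\mathbf{p}/\hbar}
      \phi(\mathbf{p})\bigr)
    &= e^{i\mathbf{x}\cdot\mathbf{p}/\hbar}\,(p_j\phi(\mathbf{p}));
\end{aligned}
\]

and in the holomorphic subspace, we have

\[
\begin{aligned}
  Q_{\mathrm{pre}}(x_j)\bigl(F(z)e^{-\alpha|\mathbf{p}|^2/(2\hbar)}\bigr)
    &= \left(\alpha\hbar\,\frac{\partial F}{\partial z_j} + z_jF(z)\right)
       e^{-\alpha|\mathbf{p}|^2/(2\hbar)}, \\
  Q_{\mathrm{pre}}(p_j)\bigl(F(z)e^{-\alpha|\mathbf{p}|^2/(2\hbar)}\bigr)
    &= \left(-i\hbar\,\frac{\partial F}{\partial z_j}\right)
       e^{-\alpha|\mathbf{p}|^2/(2\hbar)}.
\end{aligned}
\]

Proof. See Exercise 4. ■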

The invariance of the three subspaces under the prequantized position and
momentum operators follows from a general result in geometric quantization,
that for a real-valued function $f$, the prequantum operator
$Q_{\mathrm{pre}}(f)$ preserves a given quantum space if and only if the
Hamiltonian flow generated by $f$ preserves the polarization defining the
quantum space. The term “polarization” refers to the set of directions in
which the elements of the quantum space are covariantly constant. In the
case of the position, momentum, and holomorphic spaces, the set of such
directions is the same at every point, which means that the polarization is
invariant under translations. But the Hamiltonian flows generated by $x_j$
and $p_j$ are nothing but translations in the $-p_j$-directions and the
$x_j$-directions, respectively. Of course, in this simple example, we can
verify the invariance by direct computation, which also gives the indicated
form of the operators on each subspace.

Note also that in each case, the “preferred” functions act simply as
multiplication operators. In the position subspace, for example, the
position operator $Q_{\mathrm{pre}}(x_j)$ acts simply as multiplication by
$x_j$, whereas in the momentum subspace, the operator
$Q_{\mathrm{pre}}(p_j)$ acts as multiplication by $p_j$. Finally, in the
holomorphic subspace, we have

\[
  Q_{\mathrm{pre}}(z_j)\bigl(F(z)e^{-\alpha|\mathbf{p}|^2/(2\hbar)}\bigr)
    = (z_jF(z))\,e^{-\alpha|\mathbf{p}|^2/(2\hbar)},
\]

where $z_j = x_j - i\alpha p_j$, since the terms involving
$\partial F/\partial z_j$ cancel.

We now focus on the position Hilbert space and look for operators of the
form $Q_{\mathrm{pre}}(f)$ that leave the position subspace invariant.

Proposition 22.12 The position subspace is invariant under
$Q_{\mathrm{pre}}(f)$ whenever $f$ is of the form

\[
  f(\mathbf{x},\mathbf{p}) = a(\mathbf{x}) + b_j(\mathbf{x})p_j \tag{22.14}
\]

for some smooth functions $a$ and $b_1,\ldots,b_n$ on $\mathbb{R}^n$. On
the other hand, the position subspace is not invariant under the operator
$Q_{\mathrm{pre}}(p^2)$.

Proof. If $f$ is of the form (22.14), calculation shows that
$\theta(X_f) + f = a(\mathbf{x})$. If we drop any terms in $X_f$ involving
$\partial/\partial p_j$, since these are zero on the position subspace, we
end up with

\[
  Q_{\mathrm{pre}}(f)(\phi(\mathbf{x}))
    = -i\hbar\,b_j(\mathbf{x})\frac{\partial\phi}{\partial x_j}
      + a(\mathbf{x})\phi(\mathbf{x}), \tag{22.15}
\]

which is again in the position subspace. [There is no
$\mathbf{p}$-dependence in the coefficient of $\partial/\partial x_j$ in
(22.15) because $\partial f/\partial p_j$ is independent of $\mathbf{p}$.]
On the other hand, direct calculation shows that the restriction to the
position subspace of $Q_{\mathrm{pre}}(p^2)$ is

\[
  -2i\hbar\,p_j\frac{\partial}{\partial x_j} - p^2,
\]

which does not preserve the space of functions on $\mathbb{R}^{2n}$ that
are independent of $\mathbf{p}$. ■
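
Both halves of the proposition can be seen on polynomial examples. The
Python sketch below is illustrative only: it assumes $n = 1$,
$\theta = p\,dx$, the arbitrary value $\hbar = 0.5$, and a hypothetical
dictionary representation of polynomials in $(x,p)$. It applies the full
prequantum operator to a $p$-independent polynomial and tests whether the
result still is independent of $p$.

```python
# Polynomials in (x, p) are dicts {(i, j): coeff} meaning coeff * x**i * p**j.
HBAR = 0.5  # illustrative; a power of two keeps the float arithmetic exact

def clean(a):
    return {k: c for k, c in a.items() if c != 0}

def add(a, b):
    out = dict(a)
    for k, c in b.items():
        out[k] = out.get(k, 0) + c
    return clean(out)

def scale(s, a):
    return clean({k: s * c for k, c in a.items()})

def mul(a, b):
    out = {}
    for (i, j), c in a.items():
        for (k, l), d in b.items():
            out[(i + k, j + l)] = out.get((i + k, j + l), 0) + c * d
    return clean(out)

def dx(a):
    return clean({(i - 1, j): i * c for (i, j), c in a.items() if i > 0})

def dp(a):
    return clean({(i, j - 1): j * c for (i, j), c in a.items() if j > 0})

P = {(0, 1): 1}  # the coordinate function p

def q_pre(f, psi):
    # Q_pre(f) psi = i*hbar*(f_x psi_p - f_p psi_x) + (-p f_p + f) psi
    x_f_psi = add(mul(dx(f), dp(psi)), scale(-1, mul(dp(f), dx(psi))))
    mult = add(scale(-1, mul(P, dp(f))), f)
    return add(scale(1j * HBAR, x_f_psi), mul(mult, psi))

def depends_on_p(a):
    return any(j > 0 for (_, j) in a)
```

With $a(x) = x^2$, $b(x) = x + 1$, and $\phi(x) = x^3 + 2$, the output of
`q_pre` for $f = a + bp$ agrees with the right-hand side of (22.15) and has
no $p$-dependence, whereas for $f = p^2$ the output contains the terms
$-2i\hbar\,p\,\phi'(x) - p^2\phi(x)$.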


It should be noted that the expression on the right-hand side of (22.15) is
not a self-adjoint, or even symmetric, operator on $L^2(\mathbb{R}^n)$,
unless the vector field $b(\mathbf{x})$ happens to be divergence free. (Even
though the vector field $X_f$ is divergence free on $\mathbb{R}^{2n}$, the
way $X_f$ acts on functions that are independent of $\mathbf{p}$ is not
necessarily a divergence free vector field on $\mathbb{R}^n$.) This
undesirable feature of our quantization scheme is the result of our
simplistic method of passing from $L^2(\mathbb{R}^{2n})$ to
$L^2(\mathbb{R}^n)$ in our derivation of Conclusion 22.9. When we do this
reduction properly, using half-forms, we will obtain a self-adjoint
operator. See Sect. 23.6.

We now consider the behavior of the holomorphic subspace under the 
prequantized position and momentum operators. 

Proposition 22.13 For any $\alpha > 0$, let $\mathbf{H}_\alpha$ be the
subspace of $L^2(\mathbb{R}^{2n})$ consisting of smooth functions $\psi$
that satisfy $\nabla_{\partial/\partial\bar z_j}\psi = 0$, where
$\partial/\partial\bar z_j$ is as in (22.9). Then $\mathbf{H}_\alpha$ is a
closed subspace of $L^2(\mathbb{R}^{2n})$ and $\mathbf{H}_\alpha$ is
invariant under the one-parameter unitary groups generated by
$Q_{\mathrm{pre}}(x_j)$ and $Q_{\mathrm{pre}}(p_j)$. Furthermore,
$Q_{\mathrm{pre}}(x_j)$ and $Q_{\mathrm{pre}}(p_j)$ act irreducibly on
$\mathbf{H}_\alpha$ in the sense of Definition 14.6.

For each $\alpha > 0$, the holomorphic Hilbert space is a subspace of the
prequantum Hilbert space invariant under the exponentiated position and
momentum operators. Thus, the prequantum Hilbert space is far from being
irreducible under the action of those operators.

Proof. The invariance of $\mathbf{H}_\alpha$ is a simple calculation
(Exercise 5). Irreducibility can be established by reducing to the
previously established irreducibility of the Segal-Bargmann space under the
operators $T_a$ in Theorem 14.16. To this end, we should check that the
unitary map $\Psi$ in (22.12) intertwines products of exponentials of
$Q_{\mathrm{pre}}(x_j)$ and $Q_{\mathrm{pre}}(p_j)$ with operators of the
form $T_a$ (with $\hbar$ replaced by $2\alpha\hbar$). This is a
straightforward but tedious calculation, and we omit the details. ■

We conclude this section with an example of a quantum subspace that is 
invariant under the (pre)quantized Hamiltonian of a harmonic oscillator. 

Proposition 22.14 Consider a harmonic oscillator with Hamiltonian

\[
  H = \frac{1}{2m}\left(p^2 + (m\omega x)^2\right).
\]

Consider also the subspace $\mathbf{H}_\alpha$ in Proposition 22.13, with
$\alpha = 1/(m\omega)$. Then the operator $Q_{\mathrm{pre}}(H)$ leaves
$\mathbf{H}_\alpha$ invariant. Furthermore, the restriction of
$Q_{\mathrm{pre}}(H)$ to $\mathbf{H}_\alpha$ has non-negative spectrum
consisting of eigenvalues of the form $n\hbar\omega$, where $n$ ranges over
the non-negative integers.

Proposition 22.14 is a much more physically reasonable result for the
spectrum of the quantization of the non-negative function $H$ than on the
full prequantum Hilbert space, where (Proposition 22.6) the spectrum of
$Q_{\mathrm{pre}}(H)$ is not even bounded below. When we introduce the
“half-form correction” in Sect. 23.7, we will finally be able to obtain the
“correct” spectrum for the quantum harmonic oscillator, consisting of
numbers of the form $(n + 1/2)\hbar\omega$, $n = 0, 1, 2, \ldots$. See
Example 23.53.

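Proof. As in the proof of Proposition 22.6, we introduce the variable
$y = m\omega x$. With $\alpha = 1/(m\omega)$, this gives
$z = (y - ip)/(m\omega)$. We use the symplectic potential

\[
  \theta = \frac{1}{2}(p\,dx - x\,dp)
         = \frac{1}{2m\omega}(p\,dy - y\,dp).
\]

Then, since $dx(\partial/\partial\bar z) = 1/2$ and
$dp(\partial/\partial\bar z) = -i/(2\alpha)$ by (22.9),

\[
  \theta\left(\frac{\partial}{\partial\bar z}\right)
    = \frac{1}{4}\left(p + \frac{i}{\alpha}x\right)
    = \frac{i}{4\alpha}\,z
\]

and so $\nabla_{\partial/\partial\bar z} = \partial/\partial\bar z
+ z/(4\alpha\hbar)$. From this, we can easily check that the holomorphic
subspace consists of functions of the form

\[
  F(z)e^{-|z|^2/(4\alpha\hbar)}
    = F(z)\exp\left\{-\frac{y^2 + p^2}{4m\omega\hbar}\right\},
  \tag{22.16}
\]

where $F$ is holomorphic.

Meanwhile, as in the proof of Proposition 22.6, we have

\[
  Q_{\mathrm{pre}}(H)
    = i\hbar\omega\left(y\frac{\partial}{\partial p}
      - p\frac{\partial}{\partial y}\right),
\]

which is just an angular derivative in the $(y,p)$-plane. Since the
exponential factor in (22.16) is rotationally invariant,
$Q_{\mathrm{pre}}(H)$ only hits $F$. Meanwhile,

\[
\begin{aligned}
  \left(y\frac{\partial}{\partial p} - p\frac{\partial}{\partial y}\right)
    F\left(\frac{y - ip}{m\omega}\right)
    &= y\,\frac{dF}{dz}\left(-\frac{i}{m\omega}\right)
       - p\,\frac{dF}{dz}\,\frac{1}{m\omega} \\
    &= -\frac{i}{m\omega}(y - ip)\frac{dF}{dz} \\
    &= -iz\,\frac{dF}{dz}.
\end{aligned}
\]

Thus,

\[
  Q_{\mathrm{pre}}(H)\bigl(F(z)e^{-|z|^2/(4\alpha\hbar)}\bigr)
    = \hbar\omega\,z\frac{dF}{dz}\,e^{-|z|^2/(4\alpha\hbar)},
\]

which is again in the holomorphic subspace.

Finally, as in Proposition 14.15, the functions $z^n$, $n = 0, 1, 2,
\ldots$, form an orthogonal basis for the Hilbert space
$\mathbf{H}_\alpha$. Each monomial $z^n$ is an eigenvector for the operator
$z\,d/dz$ with eigenvalue $n$. This establishes the claim about the spectrum
of the restriction to $\mathbf{H}_\alpha$ of $Q_{\mathrm{pre}}(H)$. ■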

The operator $F \mapsto \hbar\omega\,z\,dF/dz$ is self-adjoint on the
holomorphic Hilbert space, in contrast to the operators in (22.15) in the
case of the position Hilbert space. Indeed, self-adjointness is “automatic”
in this case, because the holomorphic Hilbert space is actually a subspace
of the prequantum Hilbert space, and the restriction of a self-adjoint
operator to an invariant subspace is self-adjoint.


22.6 Exercises

1. Consider the vector field

\[
  X := a_j(\mathbf{x})\frac{\partial}{\partial x_j}
\]

on $\mathbb{R}^N$, where the $a_j$’s are smooth, real-valued functions. Show
that $X$ is skew-self-adjoint on $C_c^\infty(\mathbb{R}^N)$ if and only if
the divergence of $X$ (i.e., the quantity $\partial a_j/\partial x_j$) is
identically zero.

2. Using the symplectic potential $\theta = p\,dx$, compute
$Q_{\mathrm{pre}}(xp^2)$. Show that $Q_{\mathrm{pre}}(xp^2)$ is not in the
algebra of operators generated by $Q_{\mathrm{pre}}(x)$ and
$Q_{\mathrm{pre}}(p)$.

Hint: Consider how $Q_{\mathrm{pre}}(xp^2)$ acts on functions that are
independent of $p$.

3. (a) Suppose $F$ is a holomorphic function on $\mathbb{C}$ such that

\[
  \int_{\mathbb{C}} |F(z)|^2\,dz < \infty,
\]

where here $dz$ denotes the 2-dimensional Lebesgue measure on
$\mathbb{C} = \mathbb{R}^2$. Show that $F$ is identically zero.

Hint: If $F$ is not identically zero, use a power series argument to show
that the $L^2$ norm of $F$ over a disk of radius $R$ tends to infinity as
$R$ tends to infinity.

(b) Show that if a function of the form (22.13), with $F$ holomorphic on
$\mathbb{C}^n$, is square integrable, then $F$ must be identically zero.

4. Prove Proposition 22.11, using the explicit form of
$Q_{\mathrm{pre}}(x_j)$ and $Q_{\mathrm{pre}}(p_j)$ in Example 22.4.

Hint: In the case of the holomorphic subspace, express the operators
$\partial/\partial x_j$ and $\partial/\partial p_j$ in terms of the
operators $\partial/\partial z_j$ and $\partial/\partial\bar z_j$ in (22.9).

5. Show that the space of functions of the form in (22.11), where $F$ is
holomorphic on $\mathbb{C}^n$, is invariant under the operators
$e^{itQ_{\mathrm{pre}}(x_j)}$ and $e^{itQ_{\mathrm{pre}}(p_j)}$ computed in
(22.6), for all $t \in \mathbb{R}$ and $j = 1, 2, \ldots, n$.


Geometric Quantization on Manifolds


23.1 Introduction 

Geometric quantization is a type of quantization, which is a general term
for a procedure that associates a quantum system with a given classical
system. In practical terms, if one is trying to deduce what sort of quantum
system should model a given physical phenomenon, one often begins by
observing the classical limit of the system. Electromagnetic radiation, for
example, is describable on a macroscopic scale by Maxwell’s equations. On
a finer scale, quantum effects (photons) become important. How should one
determine the correct quantum theory of electromagnetism? It seems that
the only reasonable way to proceed is to “quantize” Maxwell’s equations and
then to compare the resulting quantum system to experiment.

Meanwhile, not every physically interesting system has $\mathbb{R}^{2n}$ as
its phase space. Geometric quantization, then, is an attempt to construct a
quantum Hilbert space, together with appropriate operators, starting from a
physical system having an arbitrary $2n$-dimensional symplectic manifold $N$
as its phase space. To perform geometric quantization on $N$, one must first
choose a polarization, that is, roughly, a choice of $n$ directions on $N$
in which the wave functions will be constant. If $N = T^*M$, then one may
use the “vertical polarization,” in which the wave functions are constant
along the fibers of $T^*M$. For cotangent bundles with the vertical
polarization, geometric quantization reproduces the “half-density
quantization” of Blattner [4]. (See Examples 23.45 and 23.48.) Even for
cotangent bundles, however, it is of interest to use polarizations other
than the vertical polarization, as we have seen already in the
$\mathbb{R}^n$ case. In the case of the cotangent bundle of a compact Lie
group, for example, the paper [20] shows how quantization with a complex
polarization gives rise to a generalized Segal-Bargmann transform.

Some phase spaces, meanwhile, may not even be in the form of a cotangent
bundle. In the orbit method in representation theory, for example,
the relevant symplectic manifolds are “coadjoint orbits,” which typically
are not cotangent bundles. [In the SU(2) case, for instance, these orbits are
2-spheres with the natural rotationally invariant symplectic form.] In quantum
field theory, meanwhile, one encounters Lagrangians that are linear,
rather than quadratic, in the “velocity” variables. In such cases, the initial
velocity is determined by the initial position, and one cannot think of the
space of initial conditions as a (co)tangent bundle. Systems of this form can
still be symplectic, but they are not cotangent bundles. Furthermore, it is
common to think of compact symplectic manifolds (such as $S^2$ with a
rotationally invariant symplectic form) as classical models of internal degrees
of freedom, such as spin.

To quantize these more general symplectic manifolds, one needs a more
general approach to quantization. Given a symplectic manifold $(N, \omega)$
satisfying a certain integrality condition, one can construct a line bundle $L$
over $N$ along with a connection $\nabla$ on $L$ whose curvature is $\omega/\hbar$.
One can then define “prequantum” operators, acting on sections of $L$, in
much the same way we did in the Euclidean case in Chap. 22, and these
operators will have the desired relationship between Poisson brackets and
commutators. One then chooses a polarization on $N$ and defines the quantum
Hilbert space to be the space of sections that are covariantly constant
in the directions of that polarization. If the Hamiltonian flow generated by
a function $f$ preserves the relevant polarization, then $Q_{\mathrm{pre}}(f)$ will preserve
the quantum Hilbert space. In the case of real polarizations, there may fail
to be any nonzero square-integrable sections that are covariantly constant
in the directions of the polarization, a possibility that forces us to introduce
the machinery of “half-forms.”

Let us end this introduction with a brief critique of the framework of
geometric quantization. In the first place, geometric quantization has too many
definitions (bundles, connections, curvature, polarizations, half-forms) and
too few theorems. In the second place, the class of functions that geometric
quantization allows us to quantize (those functions for which the associated
Hamiltonian flow preserves the polarization) is often dishearteningly
small. In the case $N = T^*M$, for example, with the natural “vertical”
polarization, geometric quantization does not allow us to quantize the kinetic
energy function, at least not by the “standard procedure” of geometric
quantization. Nevertheless, geometric quantization is the only game in
town if one wants to quantize general symplectic manifolds in a way that
produces an actual Hilbert space and operators thereon.

This chapter lays out in an orderly fashion all the ingredients needed
to “do” geometric quantization. Although it adds to the length, the chapter
also fills in the details of several arguments that
are only sketched in the standard reference on the subject, the book [45] of
Woodhouse. The presentation assumes basic results about symplectic manifolds
from Chap. 21. Besides the basic results about manifolds reviewed in
Sect. 21.1, we will make use of the Frobenius theorem (see, e.g., Chap. 19
of [29]).

As we have noted already in the introduction to Chap. 22, sign conventions
in the subject of geometric quantization are not consistent from
author to author.


23.2 Line Bundles and Connections 

In this section, we develop the necessary machinery to extend the
prequantization construction of Sect. 22.2 to arbitrary symplectic manifolds. We
introduce the notion of a line bundle over a manifold and sections thereof,
which look locally like complex-valued functions. We then introduce the
notion of covariant derivatives of sections of a line bundle, where locally
these covariant derivatives take the form $\nabla_X = X - i\theta(X)$ for a certain
1-form $\theta$. We then introduce the curvature 2-form, which is a globally
defined, closed 2-form that can be computed locally as $d\theta$. We continue to
observe the summation convention, in which repeated indices are always
summed on.

Definition 23.1 If $X$ is a smooth manifold, a complex line bundle over
$X$ is a smooth manifold $L$ together with the following additional structures.
First, we have a smooth, surjective map $\pi : L \to X$. Second, for each $x \in X$,
the set $\pi^{-1}(\{x\})$ is equipped with the structure of a complex vector space of
dimension 1. For each $x \in X$, the vector space $\pi^{-1}(\{x\})$ is called the fiber
of $L$ over $x$.

These structures are assumed to satisfy the local triviality property,
namely that each $x \in X$ has a neighborhood $U$ such that there exists a
diffeomorphism $\chi : \pi^{-1}(U) \to U \times \mathbb{C}$ with the following properties. First,

$$\pi(p) = \pi_1(\chi(p)),$$

where $\pi_1 : U \times \mathbb{C} \to U$ is projection onto the first factor. Second, for each
$x \in U$, the map $p \mapsto \pi_2(\chi(p))$ is a vector space isomorphism of $\pi^{-1}(\{x\})$
with $\mathbb{C}$, where $\pi_2 : U \times \mathbb{C} \to \mathbb{C}$ is projection onto the second factor.

A section of a line bundle $L$ over $X$ is a map $s : X \to L$ such that
$\pi(s(p)) = p$ for all $p \in X$.

For any manifold $X$, we can form the trivial line bundle $X \times \mathbb{C}$, where
$\pi(x, z) = x$ and where the vector space structure on $\{x\} \times \mathbb{C}$ is just the
usual vector space structure on $\mathbb{C}$. The local triviality property for a general
line bundle $L$ means that $L$ “looks” locally like the trivial line bundle.

Definition 23.2 A connection $\nabla$ on a line bundle $L$ over $N$ is a map
associating to each vector field $X$ on $N$ and section $s$ of $L$ another section
$\nabla_X(s)$ of $L$ satisfying the following properties. First, for each smooth
function $f$ on $N$, we have

$$\nabla_{fX}(s) = f\,\nabla_X(s) \tag{23.1}$$

for all vector fields $X$ and sections $s$. Second, for each smooth function $f$
on $N$, we have the product rule

$$\nabla_X(fs) = (X(f))\,s + f\,\nabla_X(s) \tag{23.2}$$

for all vector fields $X$ and sections $s$.

Note that for any section $s$ of $L$ and any function $f$ on $N$, the quantity
$fs$ is a section of $L$. Given a connection $\nabla$ and a vector field $X$, the operator
$\nabla_X$ is called the covariant derivative in the direction of $X$.

Definition 23.3 A Hermitian structure on a line bundle $L$ over $N$ is
a choice of an inner product $(\cdot, \cdot)$ on each fiber $\pi^{-1}(\{x\})$ of $L$ such that
for each smooth section $s$ of $L$, $(s, s)$ is a smooth function on $N$. A line
bundle $L$ together with a choice of a Hermitian structure on $L$ will be called
a Hermitian line bundle. A connection $\nabla$ on a Hermitian line bundle
$L$ is called Hermitian if for every vector field $X$ on $N$, we have

$$(\nabla_X(s_1), s_2) + (s_1, \nabla_X(s_2)) = X(s_1, s_2) \tag{23.3}$$

for all smooth sections $s_1$ and $s_2$ of $L$.

We will let the expression “Hermitian line bundle with connection” refer
to a Hermitian line bundle $L$ together with a Hermitian connection on $L$;
that is, in this expression, “Hermitian” applies both to the bundle and to
the connection.

Given a Hermitian line bundle $L$ with connection, it is always possible
to choose a locally defined smooth section $s_0$ near any point such that
$(s_0, s_0) = 1$. We call $s_0$ a local isometric trivialization of $L$. Any section
$s$ of $L$ can be written locally as $s = f s_0$ for a unique complex-valued
function $f$. Given a vector field $X$, let $\theta(X)$ be the unique function such
that

$$\nabla_X(s_0) = -i\theta(X)\, s_0.$$

Using the assumption $\nabla_{fX} = f\nabla_X$, it can be shown (Exercise 1) that the
value of $\theta(X)$ at a point $p$ depends only on the value of $X$ at $p$. Thus, $\theta$
defines a 1-form on $N$. Using the assumption that $\nabla$ is Hermitian, it can
be shown (Exercise 2) that $\theta(X)$ is always real valued.

Now, using the product rule (23.2) for covariant derivatives, we have

$$\nabla_X(f s_0) = X(f)\, s_0 + f\, \nabla_X(s_0) = (X(f) - i\theta(X) f)\, s_0.$$

Thus, if we identify sections of $L$ locally with the coefficient function $f$, we
have

$$\nabla_X(f) = X(f) - i\theta(X) f, \tag{23.4}$$

as in Sect. 22.2. We call $\theta$ the connection 1-form associated to the particular
local isometric trivialization.

Definition 23.4 For any Hermitian line bundle $(L, \nabla)$ with connection,
define the curvature 2-form $\omega$ of $\nabla$ by requiring that

$$\omega(X, Y)\, s = i\left(\nabla_X \nabla_Y - \nabla_Y \nabla_X - \nabla_{[X,Y]}\right)(s)$$

for all sections $s$ and vector fields $X$ and $Y$.

Of course, one should check that the given expression for $\omega$ is really a
2-form, meaning that the value of $\omega(X, Y)$ at a point $z$ depends only on
the values of $X$ and $Y$ at $z$, and that it does not depend on the choice of
section $s$, provided only that $s(z) \neq 0$. One way to do this is to compute $\omega$
in a local isometric trivialization, as in the following result. (See Exercise 3
for a different approach.)

Proposition 23.5 Let $s_0$ be a local isometric trivialization of $L$ and let $\theta$
be the associated connection 1-form. Then the curvature 2-form $\omega$ of $\nabla$ is
expressed locally as

$$\omega = d\theta.$$

In particular, $\omega$ is a closed 2-form.

Proof. The computation is precisely the same as in the proof of Proposition 
22.3 in the Euclidean case. ■ 
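
For readers without Chap. 22 at hand, the computation can be sketched as follows. This is the standard calculation, written out here under the identification (23.4) of sections with their coefficient functions $f$, and it uses the invariant formula $d\theta(X,Y) = X(\theta(Y)) - Y(\theta(X)) - \theta([X,Y])$; it is a sketch and not a quotation of the proof of Proposition 22.3.

```latex
\begin{align*}
\nabla_X \nabla_Y f
  &= (X - i\theta(X))(Y - i\theta(Y)) f \\
  &= XYf - iX(\theta(Y))\, f - i\theta(Y)\, Xf - i\theta(X)\, Yf - \theta(X)\theta(Y)\, f, \\
[\nabla_X, \nabla_Y] f
  &= [X,Y] f - i\bigl(X(\theta(Y)) - Y(\theta(X))\bigr) f, \\
\nabla_{[X,Y]} f
  &= [X,Y] f - i\theta([X,Y])\, f, \\
i\bigl(\nabla_X \nabla_Y - \nabla_Y \nabla_X - \nabla_{[X,Y]}\bigr) f
  &= \bigl(X(\theta(Y)) - Y(\theta(X)) - \theta([X,Y])\bigr) f
   = d\theta(X,Y)\, f.
\end{align*}
```

The terms symmetric in $X$ and $Y$ cancel in the commutator, which is why only derivatives of $\theta$ survive.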

A locally defined 1-form $\theta$ satisfying $d\theta = \omega$ is called a (local) symplectic
potential for $\omega$. Our next result says that every symplectic potential is the
connection 1-form for some local isometric trivialization of $L$.

Proposition 23.6 Let $(L, \nabla)$ be a Hermitian line bundle with connection
over $N$ with curvature 2-form $\omega$. For each point $z_0 \in N$ and 1-form $\theta$
defined in a neighborhood $U$ of $z_0$ satisfying $d\theta = \omega$, there is a
subneighborhood $V \subset U$ of $z_0$ and a local isometric trivialization of $L$ over $V$ such
that the connection 1-form of the trivialization is $\theta$.

Proof. Let $s_0$ be any isometric trivializing section defined in a neighborhood
of $z_0$ and let $\eta$ be the associated connection 1-form. Since $d(\eta - \theta) = 0$,
there is a subneighborhood $V \subset U$ of $z_0$ on which $\eta - \theta = df$, for some
smooth function $f$. If $s_1 = e^{if} s_0$, then

$$\begin{aligned}
\nabla_X(s_1) &= iX(f)\, e^{if} s_0 + e^{if}\, \nabla_X(s_0) \\
&= iX(f)\, e^{if} s_0 - i\eta(X)\, e^{if} s_0 \\
&= -i\,(\eta(X) - df(X))\, s_1.
\end{aligned}$$

Thus, the connection 1-form associated with the local isometric trivialization
$s_1$ is $\eta - df = \theta$. ■

Proposition 23.7 If $(L_1, \nabla^1)$ and $(L_2, \nabla^2)$ are Hermitian line bundles
with connection over $N$, let $L_1 \otimes L_2$ denote the line bundle over $N$ for
which the fiber over $x$ is $L_{1,x} \otimes L_{2,x}$, with the natural inner product induced
by the inner products on $L_{1,x}$ and $L_{2,x}$. Then there is a unique Hermitian
connection $\nabla$ on $L_1 \otimes L_2$ with the property that

$$\nabla_X(s_1 \otimes s_2) = (\nabla^1_X s_1) \otimes s_2 + s_1 \otimes (\nabla^2_X s_2)$$

for all vector fields $X$ on $N$ and all smooth sections $s_1$ of $L_1$ and $s_2$ of $L_2$.
The curvature 2-form $\omega$ for $(L_1 \otimes L_2, \nabla)$ is given by

$$\omega = \omega_1 + \omega_2,$$

where $\omega_1$ and $\omega_2$ are the curvature 2-forms for $(L_1, \nabla^1)$ and $(L_2, \nabla^2)$,
respectively.

The proof of this proposition is a straightforward exercise in “definition
chasing” and is left to the reader.

Suppose that $L$ is a Hermitian line bundle over $N$ with connection $\nabla$
and curvature 2-form $\omega$. Given a loop $\gamma : [a, b] \to N$, we can construct a
section $s$ of $L$ that is defined over $\gamma$ such that the covariant derivative of $s$
in the directions along $\gamma$ is zero. Indeed, in a local isometric trivialization,
such a section can be constructed as

$$s(\gamma(t)) = \exp\left\{ i \int_a^t \theta\!\left(\frac{d\gamma}{d\tau}\right) d\tau \right\} s_0(\gamma(t)). \tag{23.5}$$

The value of $s$ at the endpoint of the loop will in general not agree with the
value at the starting point, but will differ by multiplication by a constant
of absolute value 1.

Definition 23.8 The holonomy of a loop $\gamma : [a, b] \to N$ is the unique
constant $\alpha$ (of absolute value 1) such that $s(\gamma(b)) = \alpha\, s(\gamma(a))$, where $s$ is a
nonzero section defined over $\gamma$ that is covariantly constant in the directions
of $\gamma$.

The value of the holonomy of $\gamma$ is easily seen to be independent of the
value of $s$ at the starting point, provided this starting value is nonzero.

Suppose that $S$ is a compact, oriented surface with boundary in $N$ whose
boundary $\partial S$ is a loop. It is not hard to show that the holonomy around
$\partial S$ can be computed as

$$\operatorname{holonomy}(\partial S) = \exp\left\{ i \int_S \omega \right\}. \tag{23.6}$$

Indeed, if $S$ is contained in the domain of a local isometric trivialization,
then this result follows from (23.5) by means of Stokes's theorem
(Sect. 21.1.2).
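
The relation (23.6) can be checked numerically in the simplest setting. The following is a minimal sketch, not taken from the text: it assumes the trivial bundle over the plane with the illustrative connection 1-form theta = (B/2)(x dy - y dx), whose curvature is B dx ∧ dy, and compares the phase picked up around a circle with the exponential of the curvature integral over the enclosed disk. The constant `B` and the function names are hypothetical choices made for the example.

```python
import cmath
import math

B = 0.7  # hypothetical constant: the curvature is omega = B dx ^ dy


def theta(x, y, vx, vy):
    """Connection 1-form theta = (B/2)(x dy - y dx) applied to the vector (vx, vy)."""
    return 0.5 * B * (x * vy - y * vx)


def holonomy_circle(r, steps=20000):
    """exp(i * integral of theta) around the counterclockwise circle of radius r."""
    total = 0.0
    dt = 2.0 * math.pi / steps
    for k in range(steps):
        t = (k + 0.5) * dt                          # midpoint rule
        x, y = r * math.cos(t), r * math.sin(t)     # point on the loop
        vx, vy = -r * math.sin(t), r * math.cos(t)  # velocity of the loop
        total += theta(x, y, vx, vy) * dt
    return cmath.exp(1j * total)


def holonomy_from_curvature(r):
    """Right-hand side of (23.6): exp(i * integral of omega over the disk of radius r)."""
    return cmath.exp(1j * B * math.pi * r * r)


if __name__ == "__main__":
    for r in (0.5, 1.0, 1.3):
        print(r, holonomy_circle(r), holonomy_from_curvature(r))
```

Both functions return a complex number of absolute value 1, and the two agree up to rounding error, as Stokes's theorem predicts for a disk contained in a single trivialization.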

Now, if $S$ is a closed (i.e., boundaryless) surface, its boundary is the
trivial loop, which has a holonomy that is trivial, that is, equal to 1. (Think
of approximating $S$ by a surface for which the boundary is a very small
loop.) Thus, for any closed surface $S$, (23.6) gives

$$\exp\left\{ i \int_S \omega \right\} = 1, \qquad \partial S = \emptyset. \tag{23.7}$$

Equivalently, we have

$$\frac{1}{2\pi} \int_S \omega \in \mathbb{Z}. \tag{23.8}$$

The condition (23.8) says that $\omega/(2\pi)$ is an integral 2-form. Clearly, not
every closed 2-form satisfies this property.

The closedness of $\omega$ (Proposition 23.5) and the condition (23.8) represent
necessary conditions that the curvature of a Hermitian connection must
satisfy. It turns out that these two conditions are also sufficient.


Theorem 23.9 Suppose $\omega$ is a closed 2-form on a manifold $N$ for which
$\omega/(2\pi)$ is integral in the sense of (23.8). Then there exists a Hermitian
line bundle $L$ over $N$ with Hermitian connection $\nabla$ such that the curvature
of $\nabla$ is equal to $\omega$. If, in addition, $N$ is simply connected, then $(L, \nabla)$ is
unique up to equivalence.

See Sect. 8.3 of [45] for a proof of this result. An equivalence of two
Hermitian line bundles $L_1$ and $L_2$ with Hermitian connection over $N$ is a
diffeomorphism $\Phi : L_1 \to L_2$ such that for each $x \in N$, the restriction of
$\Phi$ to $\pi_1^{-1}(\{x\})$ is an isometric linear map onto $\pi_2^{-1}(\{x\})$ and such that for
each section $s$ of $L_1$, we have

$$\Phi(\nabla^1_X(s)) = \nabla^2_X(\Phi(s)).$$

We now have the necessary tools to proceed with the program of
geometric quantization on symplectic manifolds.

23.3 Prequantization

The first step in the program of geometric quantization for a symplectic
manifold $(N, \omega)$ is to construct a Hermitian line bundle $L$ over $N$ with
Hermitian connection for which the curvature 2-form is equal to $\omega/\hbar$.
Theorem 23.9 gives the condition for the existence of such a bundle.

Definition 23.10 A symplectic manifold $(N, \omega)$ is quantizable (for a
particular value of $\hbar$) if

$$\frac{1}{2\pi\hbar} \int_S \omega \in \mathbb{Z}$$

for every closed surface $S$ in $N$.

Note that if $(N, \omega)$ is quantizable for a given value $\hbar_0$ of Planck’s constant,
then $(N, \omega)$ is also quantizable for $\hbar = \hbar_0/k$ for every positive integer
$k$. Indeed, according to Proposition 23.7, if $L$ is a Hermitian line bundle
with connection having curvature $\omega/\hbar_0$, then $L^{\otimes k}$ (the tensor product of
$L$ with itself $k$ times) is a Hermitian line bundle with connection having
curvature $\omega/(\hbar_0/k)$.
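
As a worked illustration of the quantizability condition, consider the 2-sphere with a rotationally invariant symplectic form. The normalization below is an assumption made for the example: spherical coordinates $(\phi, \theta)$ with $0 \le \phi \le \pi$, $0 \le \theta < 2\pi$, and a hypothetical constant $c > 0$ fixing the total area.

```latex
\begin{align*}
\omega &= c \,\sin\phi \; d\phi \wedge d\theta, \\
\int_{S^2} \omega &= c \int_0^{2\pi}\!\!\int_0^{\pi} \sin\phi \; d\phi \, d\theta = 4\pi c, \\
\frac{1}{2\pi\hbar} \int_{S^2} \omega &= \frac{2c}{\hbar} \in \mathbb{Z}
  \quad\Longleftrightarrow\quad c = \frac{k\hbar}{2} \ \text{for some integer } k.
\end{align*}
```

Taking $S = S^2$ itself in Definition 23.10 thus restricts the allowed total area to integer multiples of $2\pi\hbar$; replacing $\hbar$ by $\hbar/k$ multiplies the integer $2c/\hbar$ by $k$, in line with the remark above.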

For the remainder of this chapter, we will assume that $N$ is a quantizable
symplectic manifold with symplectic form $\omega$ and that $(L, \nabla)$ is a fixed
Hermitian line bundle with connection over $N$ with curvature $\omega/\hbar$.

If $L$ is a Hermitian line bundle over a symplectic manifold $N$, we say
that a measurable section $s$ of $L$ is square integrable if

$$\|s\|^2 := \int_N (s, s)\, \lambda$$

is finite, where $\lambda$ is the Liouville volume form on $N$. Given two
square-integrable sections $s_1$ and $s_2$ of $L$, we define the inner product of $s_1$ and
$s_2$ by

$$\langle s_1, s_2 \rangle := \int_N (s_1, s_2)\, \lambda. \tag{23.9}$$

We use parentheses to denote the pointwise inner product $(s_1(x), s_2(x))$
of two sections $s_1$ and $s_2$, which is a function on $N$, and we use angled
brackets to denote the global inner product $\langle s_1, s_2 \rangle$ of the sections, which
is a number.

Definition 23.11 The prequantum Hilbert space for $N$ is the space of
equivalence classes of square-integrable sections of $L$, where two sections are
equivalent if they are equal almost everywhere with respect to the Liouville
volume measure.


Definition 23.12 If $f$ is a smooth complex-valued function on $N$, the
prequantum operator $Q_{\mathrm{pre}}(f)$ is the unbounded operator on the prequantum
Hilbert space given by

$$Q_{\mathrm{pre}}(f) = i\hbar \nabla_{X_f} + f,$$

where $f$ represents the operation of multiplying a section by $f$.

Proposition 23.13 If $f$ is real-valued, then $Q_{\mathrm{pre}}(f)$ is symmetric on the
space of smooth compactly supported sections of $L$.

Proof. Let $s_1$ and $s_2$ be smooth, compactly supported sections of $L$ and let
$\Phi_t$ denote the Hamiltonian flow generated by $f$. For all sufficiently small
$t$, every point in the supports of $s_1$ and $s_2$ will be contained in the domain of
$\Phi_t$. Furthermore, by Liouville’s theorem, the value of

$$\int_N \left[ (s_1, s_2) \circ \Phi_t \right] \lambda$$

is independent of $t$. If we differentiate this relation with respect to $t$ and
evaluate at $t = 0$, we obtain, by (23.3),

$$0 = \int_N \left[ (\nabla_{X_f}(s_1), s_2) + (s_1, \nabla_{X_f}(s_2)) \right] \lambda.$$

Thus, $\nabla_{X_f}$ is a skew-symmetric operator on the space of smooth, compactly
supported sections, from which it follows that $Q_{\mathrm{pre}}(f)$ is symmetric. ■

By the product rule for covariant derivatives and the identity $X_f(f) =
\{f, f\} = 0$, we see that the two terms in the definition of $Q_{\mathrm{pre}}(f)$ commute.
We would then expect the exponential $e^{itQ_{\mathrm{pre}}(f)}$ to decompose as a product
of two exponentials. One of these exponentials is just $e^{itf}$ and the other
may be constructed as “parallel transport along the flow generated by $X_f$.”
Thus, if the flow generated by $X_f$ is complete, it is possible to use Stone’s
theorem to construct $Q_{\mathrm{pre}}(f)$ as a self-adjoint operator on a domain that
includes the space of smooth compactly supported sections.

Proposition 23.14 For any $f, g \in C^\infty(N)$, we have

$$\frac{1}{i\hbar} \left[ Q_{\mathrm{pre}}(f), Q_{\mathrm{pre}}(g) \right] = Q_{\mathrm{pre}}(\{f, g\}),$$

where the equality holds as operators on the space of smooth sections of $L$.

Proof. The argument is precisely the same as in Proposition 22.1 in the
$\mathbb{R}^{2n}$ case. ■
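
For convenience, here is a sketch of how that argument runs in the present notation. It relies only on relations that appear in this chapter: the curvature identity $[\nabla_X, \nabla_Y] = \nabla_{[X,Y]} - (i/\hbar)\,\omega(X,Y)$ (Definition 23.4 with curvature $\omega/\hbar$), the product rule $[\nabla_X, g] = X(g)$, and the sign conventions $X_f(g) = \{f, g\}$, $\omega(X_f, X_g) = -\{f, g\}$, and $[X_f, X_g] = X_{\{f,g\}}$ used in Sect. 23.4. With other sign conventions the intermediate signs change accordingly.

```latex
\begin{align*}
[Q_{\mathrm{pre}}(f), Q_{\mathrm{pre}}(g)]
  &= (i\hbar)^2 [\nabla_{X_f}, \nabla_{X_g}]
     + i\hbar \bigl( [\nabla_{X_f}, g] + [f, \nabla_{X_g}] \bigr) \\
  &= -\hbar^2 \Bigl( \nabla_{[X_f, X_g]} - \tfrac{i}{\hbar}\, \omega(X_f, X_g) \Bigr)
     + i\hbar \bigl( X_f(g) - X_g(f) \bigr) \\
  &= -\hbar^2 \nabla_{X_{\{f,g\}}} - i\hbar \{f, g\} + 2 i\hbar \{f, g\} \\
  &= i\hbar \bigl( i\hbar \nabla_{X_{\{f,g\}}} + \{f, g\} \bigr)
   = i\hbar \, Q_{\mathrm{pre}}(\{f, g\}).
\end{align*}
```

The curvature term contributes $-i\hbar\{f,g\}$ and the two product-rule terms contribute $2i\hbar\{f,g\}$; the factor $1/\hbar$ in the curvature is exactly what makes these combine into a single multiplication term.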

As we have seen already in Sect. 22.3 in the $\mathbb{R}^{2n}$ case, the prequantum
Hilbert space is “too large” to be considered the quantization of $N$.

23.4 Polarizations

In the $\mathbb{R}^{2n}$ case, we have the position, momentum, and holomorphic
subspaces (Definition 22.7), consisting of functions that depend only on $x$, $p$,
or $z$, in the sense that the covariant derivatives of functions in the
directions of $p$, $x$, and $\bar{z}$ are zero. In each case, the “basic observables” of the
particular representation (the $x_j$’s, the $p_j$’s, and the $z_j$’s, respectively) act
simply as multiplication operators.

To generalize this to a symplectic manifold $N$ of dimension $2n$, we may
think of choosing $n$ functions $\alpha_1, \ldots, \alpha_n$ on $N$ that are “independent,” in
the sense that $d\alpha_1, \ldots, d\alpha_n$ are linearly independent at each point. We
assume that the functions $\alpha_j$ Poisson commute ($\{\alpha_j, \alpha_k\} = 0$), which makes
it reasonable to hope that the quantizations of the $\alpha_j$’s could act as
(commuting) multiplication operators. For each $z \in N$, we let $P_z$ be the
$n$-dimensional space of directions in which the $\alpha_j$’s are constant, that is,
the intersection of the kernels of $d\alpha_1, \ldots, d\alpha_n$. Since we wish to allow the
functions $\alpha_j$ to be complex valued, $P_z$ should be thought of as a subspace
of the complexified tangent space $T_z^{\mathbb{C}}(N)$. The idea is that our quantum
Hilbert space should consist of sections of a prequantum line bundle that
are covariantly constant in the directions of $P$.

Now, at each point $z$, the Hamiltonian vector field $X_{\alpha_j}$ will belong to
$P_z$, because

$$d\alpha_j(X_{\alpha_k}) = X_{\alpha_k}(\alpha_j) = \{\alpha_k, \alpha_j\} = 0.$$

Furthermore, since the $d\alpha_j$’s are linearly independent, the $X_{\alpha_j}$’s are also
independent, since $X_{\alpha_j}$ is obtained from $d\alpha_j$ by an isomorphism of tangent
and cotangent spaces. Thus, the $X_{\alpha_j}$’s must actually span $P_z$ at each point,
by a dimension count. Since also $\omega(X_{\alpha_j}, X_{\alpha_k}) = -\{\alpha_j, \alpha_k\} = 0$, we
conclude that $\omega$ is identically zero on $P_z$. Furthermore, if $X$ and $Y$ are vector
fields lying in $P$ at each point, we can express them as

$$X = a_j(z)\, X_{\alpha_j}, \qquad Y = b_j(z)\, X_{\alpha_j},$$

for some smooth functions $a_j$ and $b_j$. Then

$$[X, Y] = a_j(z)\, X_{\alpha_j}(b_k)\, X_{\alpha_k} - b_k(z)\, X_{\alpha_k}(a_j)\, X_{\alpha_j},$$

because $[X_{\alpha_j}, X_{\alpha_k}] = X_{\{\alpha_j, \alpha_k\}} = 0$. Thus, the commutator of two vector
fields lying in $P$ will again lie in $P$.

Definition 23.15 For any $z \in N$, a subspace $P$ of $T_z^{\mathbb{C}} N$ is said to be
Lagrangian if $\dim P = n$ and $\omega(X, Y) = 0$ for all $X, Y \in P$.

Definition 23.16 A polarization of a symplectic manifold $N$ is a choice
at each point $z \in N$ of a Lagrangian subspace $P_z \subset T_z^{\mathbb{C}}(N)$, satisfying the
following two conditions.

1. If two complex vector fields $X$ and $Y$ lie in $P_z$ at each point $z$, then
so does $[X, Y]$.

2. The dimension of $P_z \cap \bar{P}_z$ is constant.

The first condition is called integrability, and we have motivated this
condition in the discussion preceding the definition. The second condition
is a technical one that prevents problems with certain constructions, such
as the pairing map. (Although, in practice, one sometimes needs to work
with “polarizations” in which the second condition is violated, extra care
is needed in such cases.)

There is one small inaccuracy in our discussion of polarizations: For
purely conventional reasons, the quantum Hilbert space is defined as the
space of sections that are covariantly constant in the direction of $\bar{P}$, rather
than $P$. Thus, $P$ should really be the complex conjugate of the space of
directions in which the sections are constant. This convention, however,
makes no difference to the definition of a polarization, since if $P$ satisfies
the conditions of Definition 23.16, so does $\bar{P}$.

Example 23.17 If $M$ is any smooth manifold, let $N = T^*M$ be the
cotangent bundle of $M$, equipped with the canonical 2-form $\omega$ (Example 21.2).
For each $z \in T^*M$, let $P_z$ be the complexification of the tangent space
to the fiber $T_x^*M$ through $z$. Then $P$ is a polarization on $T^*M$, called the vertical
polarization.

Proof. If $\{x_j\}$ is any local coordinate system on $M$, let $\{x_j, p_j\}$ be the
associated local coordinate system on $T^*M$. The canonical 2-form is given
by $\omega = dp_j \wedge dx_j$. At each point $z \in T^*M$, the vertical subspace $P_z$ is
spanned by the vectors $\partial/\partial p_j$. Since $\omega(\partial/\partial p_j, \partial/\partial p_k) = 0$, we see that $P_z$
is Lagrangian. Furthermore, $\bar{P}_z = P_z$ at every point, and so $\dim P_z \cap \bar{P}_z$
has the constant value $n = \dim M$. Finally, the integrability of $P$ follows by
computing the commutator of two vector fields of the form $f_j(x, p)\, \partial/\partial p_j$,
which will again be a linear combination of the $\partial/\partial p_j$’s. Integrability also
follows from the easy direction of the Frobenius theorem, since the fibers
of $T^*M$ are integral submanifolds for $P$. ■

We may identify two special classes of polarizations, those that are purely
real (i.e., $\bar{P}_z = P_z$ for all $z \in N$) and those that are purely complex (i.e.,
$P_z \cap \bar{P}_z = \{0\}$ for all $z \in N$). The vertical polarization, for example, is
purely real.

If $P$ is purely real, the integrability of $P$ implies, by the Frobenius
theorem, that every point in $N$ is contained in a unique submanifold $R$ that is
maximal in the class of connected integral submanifolds for $P$. [An integral
submanifold $R$ for $P$ is a submanifold for which $T_z^{\mathbb{C}}(R) = P_z$ for all $z \in R$.]
We will refer to the maximal connected, integral submanifolds of a purely
real polarization as the leaves of the polarization.

In general, the leaves may not be embedded submanifolds of $N$. Suppose,
for example, that $N = S^1 \times S^1$, with $\omega = d\theta \wedge d\phi$, where $\theta$ and $\phi$ are angular
coordinates on the two copies of $S^1$. Then the tangent space to $N$ at any
point may be identified with $\mathbb{R}^2$ by means of the basis $\{\partial/\partial\theta, \partial/\partial\phi\}$. We
may define a polarization $P$ on $N$ by defining $P_z$ to be the span of the
vector

$$\frac{\partial}{\partial\theta} + a \frac{\partial}{\partial\phi},$$

for some fixed irrational number $a$. Each leaf of $P$ is then a set of the form

$$\left\{ (e^{i\theta_0} e^{it}, e^{iat}) \in S^1 \times S^1 \;\middle|\; t \in \mathbb{R} \right\},$$

for some $\theta_0$, which is an “irrational line” in $S^1 \times S^1$. Each leaf is then
dense in $S^1 \times S^1$ and, thus, not embedded. We will need to avoid such
pathological examples if we hope to successfully carry out the program
of geometric quantization with respect to a real polarization. Much more
information about the structure of real polarizations may be found in Sects.
4.5-4.7 of [45].
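
The density of an irrational line can be seen in a small numerical experiment. The sketch below is illustrative only and is not part of the text: it uses the fact that the leaf returns to the circle theta = theta_0 at the parameter values t = 2*pi*k, where its second coordinate is 2*pi*a*k (mod 2*pi), and measures how close these return points come to an arbitrary target angle. The function name `closest_return` and the sample values of `a` and `target` are hypothetical.

```python
import math


def closest_return(a, target, max_k):
    """Smallest circular distance from `target` to the points 2*pi*a*k (mod 2*pi), 1 <= k <= max_k."""
    two_pi = 2.0 * math.pi
    best = two_pi
    for k in range(1, max_k + 1):
        phi = (two_pi * a * k) % two_pi   # second coordinate at the k-th return
        d = abs(phi - target) % two_pi
        d = min(d, two_pi - d)            # distance measured around the circle
        best = min(best, d)
    return best


if __name__ == "__main__":
    target = 1.0
    for n in (10, 100, 1000, 10000):
        # Irrational slope: the gap keeps shrinking as more returns are allowed.
        print(n, closest_return(math.sqrt(2.0), target, n))
    # Rational slope: the leaf closes up, so the gap stays bounded away from zero.
    print(closest_return(0.5, target, 10000))
```

For the irrational slope the return points fill the circle more and more finely, which is the numerical shadow of the leaf being dense; for a rational slope the leaf is a closed curve and only finitely many return points occur.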

We now consider some elementary results concerning purely complex
polarizations.

Proposition 23.18 Suppose $P$ is a purely complex polarization on $N$. For
each $z \in N$, let $J_z : T_z^{\mathbb{C}} N \to T_z^{\mathbb{C}} N$ be the unique linear map such that $J_z =
iI$ on $P_z$ and $J_z = -iI$ on $\bar{P}_z$. Then $J_z$ is real (i.e., it maps the real tangent
space to itself) and $\omega$ is $J_z$-invariant [i.e., $\omega(J_z X_1, J_z X_2) = \omega(X_1, X_2)$ for
all $X_1, X_2 \in T_z^{\mathbb{C}} N$].

Proof. Since the restriction of $J_z$ to $\bar{P}_z$ is the complex-conjugate of its
restriction to $P_z$, the map $J_z$ commutes with complex conjugation and thus
maps real vectors (those satisfying $\bar{X} = X$) to real vectors. Meanwhile,
since $P_z$ is Lagrangian and $\omega$ is real, $\bar{P}_z$ is also Lagrangian. Given two
vectors $X_1 = Y_1 + Z_1$ and $X_2 = Y_2 + Z_2$, with $Y_j \in P_z$ and $Z_j \in \bar{P}_z$, we
compute that

$$\begin{aligned}
\omega(J_z X_1, J_z X_2)
&= \omega(iY_1, iY_2) + \omega(iY_1, -iZ_2) + \omega(-iZ_1, iY_2) + \omega(-iZ_1, -iZ_2) \\
&= \omega(Y_1, Z_2) + \omega(Z_1, Y_2).
\end{aligned}$$

A similar calculation gives the same value for $\omega(X_1, X_2)$, showing that $\omega$
is $J_z$-invariant. ■

A complex structure on a $2n$-dimensional manifold $N$ is a collection of
“holomorphic” coordinate systems that cover $N$ and such that the transition
maps between coordinate systems are holomorphic as maps between
open sets in $\mathbb{R}^{2n} = \mathbb{C}^n$. At each point $z \in N$, there is a linear map
$J_z : T_z N \to T_z N$ defined by the expression

$$J_z\!\left( \frac{\partial}{\partial x_j} \right) = \frac{\partial}{\partial y_j}, \qquad
J_z\!\left( \frac{\partial}{\partial y_j} \right) = -\frac{\partial}{\partial x_j},$$

where the $x_j$’s and $y_j$’s are the real and imaginary parts of holomorphic
coordinates. This map is independent of the choice of holomorphic
coordinates and satisfies $J_z^2 = -I$. At each point $z \in N$, the complexified tangent
space $T_z^{\mathbb{C}} N$ can be decomposed into eigenspaces for $J_z$ with eigenvalues $i$
and $-i$; these are called the $(1,0)$- and $(0,1)$-tangent spaces, respectively.

Meanwhile, if $N$ is any $2n$-dimensional manifold and $J$ is a smoothly
varying family of linear maps on each tangent space satisfying $J_z^2 = -I$ for
all $z$, then $J$ is called an almost-complex structure. Given an almost-complex
structure, we can divide the complexified tangent space into $\pm i$ eigenspaces
for $J$. The Newlander-Nirenberg theorem asserts that if the family of $+i$
eigenspaces is integrable (in the sense of Point 1 of Definition 23.16), then
there exists a unique complex structure on $N$ for which these are the
$(1,0)$-tangent spaces.

A purely complex polarization $P$ gives rise to a complex structure on $N$,
as follows. By Proposition 23.18 and the Newlander-Nirenberg theorem,
there is a unique complex structure on $N$ for which $P_z$ is the $(1,0)$-tangent
space, for all $z \in N$.

Now, we have already seen in the $\mathbb{R}^{2n}$ case that some purely complex
polarizations behave better than others. [Compare (22.11) to (22.13).] The
geometric condition that characterizes the “good” polarizations is the
following.

Definition 23.19 For any purely complex polarization $P$, let $J$ be the
unique almost-complex structure on $N$ such that $J_z = iI$ on $P_z$ and $J_z =
-iI$ on $\bar{P}_z$. We say that $P$ is a Kähler polarization if the bilinear form

$$g(X, Y) := \omega(X, J_z Y) \tag{23.10}$$

is positive definite for each $z \in N$.

For any purely complex polarization, the bilinear form $g$ in (23.10) is
symmetric, as the reader may easily verify using the $J_z$-invariance of $\omega$.

Suppose, for example, that we identify $\mathbb{R}^2$ with $\mathbb{C}$ by the map $z = x - i\alpha p$,
for some fixed $\alpha > 0$. If we define a purely complex polarization on $\mathbb{R}^2$ by
taking $P_z$ to be the span of the vector $\partial/\partial z$ in (22.9), then (Exercise 4) $P$
is a Kähler polarization.
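
The claim in this example can be sanity-checked with a few lines of arithmetic. The sketch below is not a solution to Exercise 4, only a numerical check under stated assumptions: vectors on the plane are stored as pairs (x-component, p-component), the symplectic form is taken to be dp ∧ dx with the convention (dp ∧ dx)(X, Y) = dp(X)dx(Y) - dp(Y)dx(X), and the vector d/dz is taken to be (1/2)(d/dx + (i/alpha) d/dp), which is what the chain rule gives for z = x - i*alpha*p. The function names are hypothetical.

```python
def omega(X, Y):
    """Symplectic form dp ^ dx on vectors stored as (x-component, p-component)."""
    return X[1] * Y[0] - X[0] * Y[1]


def J(alpha, X):
    """Candidate complex structure: J(d/dx) = -(1/alpha) d/dp and J(d/dp) = alpha d/dx."""
    return (alpha * X[1], -X[0] / alpha)


def g(alpha, X, Y):
    """Bilinear form g(X, Y) = omega(X, J Y) from (23.10)."""
    return omega(X, J(alpha, Y))


if __name__ == "__main__":
    alpha = 2.0
    dz = (0.5, 0.5j / alpha)           # components of d/dz (assumed form, see lead-in)
    print(J(alpha, dz), (1j * dz[0], 1j * dz[1]))   # J acts as multiplication by i on d/dz
    X, Y = (1.0, 3.0), (-2.0, 0.5)
    print(g(alpha, X, Y), g(alpha, Y, X))           # symmetry of g
    print(g(alpha, X, X), g(alpha, Y, Y))           # positivity on sample vectors
```

In these coordinates g(X, Y) works out to X_x Y_x / alpha + alpha X_p Y_p, which is symmetric and positive definite exactly when alpha > 0, matching the hypothesis of the example.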


23.5 Quantization Without Half-Forms 

To construct a prequantum Hilbert space, we must choose a line bundle
$(L, \nabla)$ over $(N, \omega)$ having curvature $\omega/\hbar$. Such a bundle exists if
$\omega/(2\pi\hbar)$ is an integral 2-form and is unique (up to equivalence) if $N$ is simply
connected. To pass to the quantum Hilbert space, we must make a substantial
additional choice, that of a polarization $P$ on $N$. In our first attempt at
defining the quantum Hilbert space associated with $P$, we consider the
space of sections of $(L, \nabla)$ that are covariantly constant in the directions
of $\bar{P}$. Although this approach works reasonably well for a purely complex
polarization, in the case of a purely real polarization, there typically are no
square-integrable sections satisfying this condition. (Indeed, we have seen
this problem already in the $\mathbb{R}^{2n}$ case, in Sect. 22.4.) In the next section, we
will introduce half-forms to address this problem.

In the remainder of the chapter, we will let $P$ denote a fixed polarization
on $N$.


23.5.1 The General Case 

As we have remarked, it is customary to consider sections that are
covariantly constant in the directions of $\bar{P}$ rather than in the directions
of $P$.

Definition 23.20 A smooth section $s$ of $L$ is polarized (with respect to
$P$) if

$$\nabla_X s = 0 \tag{23.11}$$

for every vector field $X$ lying in $\bar{P}$. The quantum Hilbert space associated
with $P$ is the closure in the prequantum Hilbert space of the space of smooth,
square-integrable, polarized sections of $L$.

As in the Euclidean case, we will simply restrict the prequantum operators
to the quantum Hilbert space, in those cases where $Q_{\mathrm{pre}}(f)$ preserves
the space of polarized sections.

Definition 23.21 A smooth, complex-valued function $f$ on $N$ is quantizable
with respect to $P$ if $Q_{\mathrm{pre}}(f)$ preserves the space of smooth sections
that are polarized with respect to $P$.

The following definition will provide a natural geometric condition
guaranteeing quantizability of a function.

Definition 23.22 A possibly complex vector field $X$ preserves a
polarization $P$ if for every vector field $Y$ lying in $P$, the vector field $[X, Y]$ also
lies in $P$.

Note that if $X$ lies in $P$, then $X$ preserves $P$, by the integrability
assumption on $P$. There will typically be, however, many vector fields that do not
lie in $P$ but nevertheless preserve $P$.

If $X$ is a real vector field, then $[X, Y]$ is the same as the Lie derivative
$\mathcal{L}_X(Y)$. It is then not hard to show that $X$ preserves $P$ if and only if the
flow generated by $X$ preserves $P$, that is, if and only if $(\Phi_t)_*(P_z) = P_{\Phi_t(z)}$
for all $z$ and $t$, where $\Phi$ is the flow of $X$. Furthermore, if $X$ is real, then $X$
preserves $P$ if and only if $X$ preserves $\bar{P}$.




Example 23.23 If $N = T^*M$ for some manifold $M$ and $P$ is the vertical
polarization on $N$, then a Hamiltonian vector field $X_f$ preserves $P$ if and
only if $f = f_1 + f_2$, where $f_1$ is constant on each fiber and $f_2$ is linear on
each fiber.

Proof. In local coordinates $\{x_j, p_j\}$, a vector field $X$ lying in $P$ has the
form $X = g_j\, \partial/\partial p_j$. Thus,

$$[X_f, X] = -\left[ \frac{\partial f}{\partial p_j} \frac{\partial}{\partial x_j},\; g_k \frac{\partial}{\partial p_k} \right]
+ \left[ \frac{\partial f}{\partial x_j} \frac{\partial}{\partial p_j},\; g_k \frac{\partial}{\partial p_k} \right].$$

This commutator will consist of three “good” terms, which involve only
$p$-derivatives, along with the following “bad” term:

$$g_k \frac{\partial^2 f}{\partial p_k\, \partial p_j} \frac{\partial}{\partial x_j}.$$

If $\partial^2 f/\partial p_k \partial p_j$ is 0 for all $j$ and $k$, then the bad term vanishes and $[X_f, X]$
again lies in $P$. Conversely, if we want the bad term to vanish for each
choice of the coefficient functions $g_j$, we must have $\partial^2 f/\partial p_k \partial p_j = 0$ for all
$j$ and $k$. Thus, for each fixed value of $x$, $f$ must contain only terms that
are independent of $p$ and terms that are linear in $p$. ■
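
To see the criterion at work, it may help to write out the case $n = 1$, $M = \mathbb{R}$. The sketch below assumes the sign convention $X_f = -(\partial f/\partial p)\,\partial/\partial x + (\partial f/\partial x)\,\partial/\partial p$, which is what $\omega(X_f, \cdot) = df$ gives for $\omega = dp \wedge dx$ under the usual wedge convention; with the opposite convention every commutator below changes sign and the conclusion is unaffected. The mass $m$ and the functions $a$, $V$, and $g$ are illustrative.

```latex
\begin{align*}
f &= \frac{p^2}{2m}: &
X_f &= -\frac{p}{m}\,\frac{\partial}{\partial x}, &
\Bigl[X_f,\; g\,\frac{\partial}{\partial p}\Bigr]
  &= -\frac{p}{m}\,\frac{\partial g}{\partial x}\,\frac{\partial}{\partial p}
     + \frac{g}{m}\,\frac{\partial}{\partial x}, \\[4pt]
f &= a(x)\,p + V(x): &
X_f &= -a\,\frac{\partial}{\partial x} + (a' p + V')\,\frac{\partial}{\partial p}, &
\Bigl[X_f,\; g\,\frac{\partial}{\partial p}\Bigr]
  &= \Bigl( -a\,\frac{\partial g}{\partial x} + (a' p + V')\,\frac{\partial g}{\partial p} - a'\, g \Bigr)
     \frac{\partial}{\partial p}.
\end{align*}
```

For the kinetic energy, the term $(g/m)\,\partial/\partial x$ does not vanish unless $g = 0$, so $X_f$ does not preserve the vertical polarization, in agreement with the remark in Sect. 23.1. For a function that is at most linear in $p$, no $\partial/\partial x$ term appears.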

We now identify the condition for quantizability of functions. 

Theorem 23.24 For any smooth, complex-valued function f on N, if the 
Hamiltonian vector field Xf preserves P, then f is quantizable. 

Since we do not assume that / is real-valued, the condition that Xf 
preserve P is not equivalent to the condition that Xf preserve P. 

Proof. Given a polarized section $s$, we apply $Q_{\mathrm{pre}}(f)$ to $s$
and then test whether $Q_{\mathrm{pre}}(f)s$ is still polarized, by applying
$\nabla_X$ for some vector field $X$ lying in $\bar{P}$. To this end, it is
useful to compute the commutator of $\nabla_X$ and $Q_{\mathrm{pre}}(f)$, as
follows:

\[
\begin{aligned}
[\nabla_X, Q_{\mathrm{pre}}(f)]
  &= i\hbar\,[\nabla_X, \nabla_{X_f}] + [\nabla_X, f] \\
  &= i\hbar \Bigl( \nabla_{[X, X_f]} - \frac{i}{\hbar}\,\omega(X, X_f) \Bigr) + X(f) \\
  &= i\hbar\,\nabla_{[X, X_f]},
\end{aligned}
\tag{23.12}
\]

where we have used that

\[
\omega(X, X_f) = -\omega(X_f, X) = -df(X) = -X(f),
\]

by Definition 21.6. Since $X_f$ preserves $\bar{P}$, the vector field
$[X, X_f]$ again lies in $\bar{P}$ and, thus,

\[
\nabla_X\bigl(Q_{\mathrm{pre}}(f)s\bigr)
  = Q_{\mathrm{pre}}(f)\nabla_X s + i\hbar\,\nabla_{[X, X_f]}s = 0,
\]

for every polarized section $s$, showing that $Q_{\mathrm{pre}}(f)s$ is again
polarized. ■

The converse of Theorem 23.24 is false in general. After all, as we will see
in the following subsections, for a given polarization, there may not be any
nonzero globally defined polarized sections, in which case, any function is
quantizable. On the other hand, it can be shown that if $Q_{\mathrm{pre}}(f)$
preserves the space of locally defined polarized sections, then the
Hamiltonian flow generated by $f$ must preserve $\bar{P}$. This result follows
by the same reasoning as in the proof of Theorem 23.24, once we know that
there are sufficiently many locally defined polarized sections. We will
establish such an existence result for purely real and purely complex
polarizations in the following subsections; for the general case, see the
discussion following Definition 9.1.1 in [45].

A special case of Theorem 23.24 is provided by "polarized functions,"
that is, functions $f$ for which $X(f) = 0$ for all vector fields $X$ lying in
$\bar{P}$. For such an $f$, the action of $Q_{\mathrm{pre}}(f)$ on the quantum
space is simply multiplication by $f$, as we anticipated in the introductory
discussion in Sect. 23.4.

Proposition 23.25 If $f$ is a smooth, complex-valued function on $N$ and
the derivatives of $f$ in the $\bar{P}$ directions are zero, then
$Q_{\mathrm{pre}}(f)$ preserves the space of $P$-polarized sections, and the
restriction of $Q_{\mathrm{pre}}(f)$ to this space is simply multiplication
by $f$.

We have already seen special cases of this result in the $\mathbb{R}^{2n}$
case; see the discussion following Proposition 22.11.

Proof. If the derivatives of $f$ in the direction of $\bar{P}$ are zero, then
for $X \in \bar{P}$, we have

\[
0 = X(f) = df(X) = \omega(X_f, X),
\]

meaning that $X_f$ is in the $\omega$-orthogonal complement of $\bar{P}$. But
since $\bar{P}$ is Lagrangian, this complement is just $\bar{P}$. Thus, $X_f$
belongs to $\bar{P}$ and, in particular, $X_f$ preserves $\bar{P}$, so that
$f$ is quantizable, by Theorem 23.24. Furthermore, $\nabla_{X_f} s = 0$ for
any $P$-polarized section $s$, leaving only the $fs$ term in the formula for
$Q_{\mathrm{pre}}(f)s$. ■
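As a concrete instance of Proposition 23.25, consider the vertical polarization on $T^*\mathbb{R}^n$ (where $\bar{P} = P$) and a function of position only. The sketch below assumes the formula $Q_{\mathrm{pre}}(f) = i\hbar\nabla_{X_f} + f$ used in the proof of Theorem 23.24 and the coordinate expression for $X_f$ used in the proof of Example 23.23; the symbol $V$ is introduced here only for illustration.

```latex
% f = V(x) depends on x only, so \partial f / \partial p_j = 0 and
\[
X_V = -\,\frac{\partial V}{\partial x_j}\,\frac{\partial}{\partial p_j},
\]
% which lies in the vertical polarization P. For a polarized section s,
% \nabla_{\partial/\partial p_j} s = 0, and therefore
\[
Q_{\mathrm{pre}}(V)\,s
  = i\hbar\,\nabla_{X_V}s + V s
  = -\,i\hbar\,\frac{\partial V}{\partial x_j}\,
      \nabla_{\partial/\partial p_j}s + V s
  = V s .
\]
```

So a potential-energy function acts on the polarized subspace by multiplication, as the proposition predicts.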

23.5.2 The Real Case 

In the $\mathbb{R}^{2n}$ case, we have already computed the space of polarized
sections for the vertical polarization in Proposition 22.8. As we observed
there, there are no nonzero polarized sections that are square integrable over
$\mathbb{R}^{2n}$. The same difficulty is easily seen to arise for the vertical
polarization on any cotangent bundle $N = T^*M$. In Sect. 23.6, we will
introduce half-forms to deal with this failure of square integrability.

We now examine properties of general real polarizations. We will see that
polarized sections always exist locally, but not always globally.

Proposition 23.26 If $P$ is a purely real polarization on $N$, then for any
$z_0 \in N$, there exist a neighborhood $U$ of $z_0$ and a $P$-polarized
section $s$ of $L$ defined over $U$ such that $s(z_0) \neq 0$.

Proof. According to the local form of the Frobenius theorem, we can find
a neighborhood $U$ of $z_0$ and a diffeomorphism $\Phi$ of $U$ with a
neighborhood $V$ of the origin in $\mathbb{R}^n \times \mathbb{R}^n$ such that
under $\Phi$, the polarization $P$ looks like the vertical polarization. That
is to say, for each $z \in U$, the image of $P_z$ under $\Phi_*(z)$ is just the
span of the vectors $\partial/\partial y_1, \ldots, \partial/\partial y_n$,
where the $y$'s are the coordinates on the second copy of $\mathbb{R}^n$. By
shrinking $U$ if necessary, we can assume that $L$ can be trivialized over $U$
and that the open set $V$ is the product of a ball $B_1$ centered at the origin
in the first copy of $\mathbb{R}^n$ with a ball $B_2$ centered at the origin in
the second copy of $\mathbb{R}^n$.

Let $\theta$ be the connection 1-form for an isometric trivialization of $L$
over $U$ and let $\tilde{\theta} = (\Phi^{-1})^*(\theta)$. Since the subspaces
$P_z$ are Lagrangian, the restriction of $\tilde{\theta}$ to each set of the
form $\{x\} \times B_2$ is closed. Since $B_2$ is simply connected, there
exists, for each $x \in B_1$, a function $f_x$ on $B_2$ such that the
restriction of $\tilde{\theta}$ to $\{x\} \times B_2$ equals $df_x$. If we
assume that $f_x(0) = 0$, then $f_x(y)$ will be smooth as a function of
$(x, y)$, since it is obtained simply by integrating $\tilde{\theta}$ from 0 to
$y$ in the vertical directions.

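Now, let $\phi$ be any smooth function on $B_1$ with $\phi(0) \neq 0$ and
define a function $\tilde{\psi}$ on $B_1 \times B_2$ by

\[
\tilde{\psi}(x, y) = \phi(x)\,e^{i f_x(y)/\hbar}.
\]

For any "vertical" vector field $X$ (i.e., one where $X$ is a linear
combination of $\partial/\partial y_1, \ldots, \partial/\partial y_n$ with
smooth coefficients), we compute that

\[
X\tilde{\psi} = \frac{i}{\hbar}(X f_x)\,\tilde{\psi}
  = \frac{i}{\hbar}\,df_x(X)\,\tilde{\psi}
  = \frac{i}{\hbar}\,\tilde{\theta}(X)\,\tilde{\psi}.
\]

Thus,

\[
\Bigl( X - \frac{i}{\hbar}\,\tilde{\theta}(X) \Bigr)\tilde{\psi} = 0,
\]

from which it follows that the function $\psi := \tilde{\psi} \circ \Phi$
represents a polarized section on $U$ in the given local trivialization
of $L$. ■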

The existence of nonzero global polarized sections for a purely real
polarization $P$ is a more delicate question. If the leaves of $P$ are not
embedded, there is little chance of finding global polarized sections. Even if
the leaves are embedded, there are obstructions. Since the tangent spaces to
the leaves of $P$ are Lagrangian subspaces, the restriction of $L$ to any leaf
$R$ has zero curvature. There may, nevertheless, be loops in $R$ for which the
holonomy (Definition 23.8) is nontrivial. After all, if a loop $\gamma$ in $R$
is not the boundary of a surface $S$ in $R$, then we cannot apply (23.6) to
conclude that the holonomy of $\gamma$ is trivial. The collection of holonomies
for a leaf $R$ of $P$ can be understood as a homomorphism of $\pi_1(R)$ into
$S^1$. If there is any loop in $R$ with nontrivial holonomy, any polarized
section of $L$ must vanish on $R$.

Definition 23.27 A submanifold $R$ of $N$ is said to be Lagrangian if
$\dim R = n$ and $T_z R$ is a Lagrangian subspace of $T_z N$ for each
$z \in R$. A Lagrangian submanifold $R$ of $N$ is said to be Bohr-Sommerfeld
(with respect to $L$) if the holonomy in $L$ of every loop in $R$ is trivial.

We may summarize the preceding discussion as follows. 

Conclusion 23.28 For a purely real polarization P with embedded leaves, 
a polarized section vanishes on every leaf of P that is not Bohr-Sommerfeld. 

Our next example suggests that when the leaves are compact, the
Bohr-Sommerfeld leaves typically form a discrete set within the set of all
leaves.

Example 23.29 Let $N = S^1 \times \mathbb{R}$, equipped with the symplectic
form $\omega = dx \wedge d\phi$, where $x$ is the linear coordinate on
$\mathbb{R}$ and $\phi$ is the angular coordinate on $S^1$. Let $L$ be the
trivial line bundle on $N$, with sections that are identified with smooth
functions. Let $\theta = x\,d\phi$ and define a connection $\nabla$ on $L$ by
$\nabla_X = X - (i/\hbar)\theta(X)$, and let $P$ be the purely real
polarization of $N$ for which the leaves are the sets of the form
$S^1 \times \{x\}$, for $x \in \mathbb{R}$. Then a leaf $S^1 \times \{x\}$ is
Bohr-Sommerfeld if and only if $x/\hbar$ is an integer.

In particular, there are no nonzero, smooth polarized sections of $L$.

Proof. If we define a section locally on a given leaf $S^1 \times \{x\}$ as

\[
s(\phi) = c\,e^{i x \phi/\hbar}
\]

for some nonzero constant $c$, then it is easily verified that
$\nabla_{\partial/\partial\phi}\,s = 0$. After one trip around the circle, the
value of this section will be the starting value times $e^{2\pi i x/\hbar}$.
Thus, the holonomy around $S^1 \times \{x\}$ is trivial if and only if
$x/\hbar$ is an integer. A polarized section, then, would have to vanish on all
the leaves where $x/\hbar$ is not an integer. Since such leaves form a dense
subset of $N$, any smooth polarized section must be identically zero. ■
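The holonomy computation in Example 23.29 can be checked numerically. The following is a minimal sketch, not part of the text: the function names and the sample values of $x$ and $\hbar$ are hypothetical, and the stepwise transport simply applies the relation $ds/d\phi = (ix/\hbar)s$ satisfied by a parallel section along the leaf.

```python
import cmath
import math


def holonomy(x: float, hbar: float) -> complex:
    """Closed-form holonomy exp(2*pi*i*x/hbar) around the leaf S^1 x {x}."""
    return cmath.exp(2j * math.pi * x / hbar)


def transport_around_leaf(x: float, hbar: float, steps: int = 1000) -> complex:
    """Parallel-transport the value 1 once around S^1 x {x} in small steps.

    Along the leaf, a parallel section satisfies ds/dphi = (i*x/hbar) * s,
    so each step of size dphi multiplies the value by exp(i*x*dphi/hbar).
    """
    dphi = 2.0 * math.pi / steps
    step_factor = cmath.exp(1j * x * dphi / hbar)
    value = 1.0 + 0.0j
    for _ in range(steps):
        value *= step_factor
    return value


def is_bohr_sommerfeld(x: float, hbar: float, tol: float = 1e-9) -> bool:
    """True when the holonomy around S^1 x {x} equals 1 up to tol."""
    return abs(holonomy(x, hbar) - 1.0) < tol


if __name__ == "__main__":
    hbar = 0.5  # illustrative value only, not taken from the text
    for x in (1.0, 1.25, 1.5):
        print(x, is_bohr_sommerfeld(x, hbar), holonomy(x, hbar))
```

With these sample values, the leaves at $x = 1.0$ and $x = 1.5$ have $x/\hbar$ an integer and are reported as Bohr-Sommerfeld, while the leaf at $x = 1.25$ is not.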

Even in cases, such as Example 23.29, where there are no smooth polarized
sections, one may still consider "distributional" polarized sections that are
supported on the Bohr-Sommerfeld leaves, as on pp. 251-252 of [45].

23.5.3 The Complex Case 

In Proposition 22.8, we computed the space of polarized sections for a
certain positive, translation-invariant polarization on $\mathbb{R}^{2n}$,
namely the one for which $P_z$ is spanned by the vectors
$\partial/\partial z_j$ in (22.9). The situation here is better than that for
the vertical polarization, in that there are nonzero polarized sections that
are square integrable over $\mathbb{R}^{2n}$. Recall, however, that if we take
our polarization to be spanned by the vectors $\partial/\partial \bar{z}_j$,
then [see (22.13)] there are no nonzero square-integrable polarized sections.
This example indicates the importance of the positivity condition in
Definition 23.19.

For our next example, we consider the unit disk $D$, equipped with the unique
(up to a constant) symplectic form that is invariant under the group of
fractional linear transformations that map $D$ onto $D$. In this case, the
quantum Hilbert space can be identified with a weighted Bergman space, that
is, an $L^2$ space of holomorphic functions on $D$ with respect to a measure
of the form $(1 - |z|^2)^{\nu}\,dx\,dy$.

Example 23.30 Let $N$ be the unit disk $D \subset \mathbb{R}^2$ equipped with
the following symplectic form:

\[
\omega = 4(1 - |z|^2)^{-2}\,dx \wedge dy = 4(1 - r^2)^{-2}\,r\,dr \wedge d\phi,
\]

where $(r, \phi)$ are the usual polar coordinates. Let $L$ be the trivial line
bundle over $D$ with connection $\nabla_X = X - (i/\hbar)\theta(X)$, where
$\theta$ is the symplectic potential for $\omega$ given by

\[
\theta = 2\,\frac{r^2}{1 - r^2}\,d\phi.
\]

Define a complex polarization on $D$ by letting
$P_z = \mathrm{Span}(\partial/\partial z)$, where $z = x - iy$. In that case,
holomorphic sections $s$ have the form

\[
s(z) = F(z)\,(1 - |z|^2)^{1/\hbar},
\]

where $F$ is holomorphic. The norm of such a section is computed as

\[
\|s\|^2 = 4 \int_D |F(z)|^2\,(1 - |z|^2)^{2/\hbar - 2}\,dx\,dy.
\]

As in the case of the plane, the seemingly unnatural definition $z = x - iy$
is necessary to obtain a Kähler polarization. If we used $z = x + iy$ instead,
the holomorphic sections would have the form $F(z)(1 - |z|^2)^{-1/\hbar}$, in
which case there would be no nonzero, square-integrable holomorphic sections.

Proof. See Exercise 8. ■

We now consider general purely complex polarizations. Recall that, by
Proposition 23.18 and the Newlander-Nirenberg theorem, $N$ has a unique
complex structure for which $P_z$ is the (1, 0)-subspace of
$T_z^{\mathbb{C}} N$, for all $z \in N$. As in the purely real case, there
always exist local polarized sections.

Theorem 23.31 Suppose $P$ is a purely complex polarization on $N$. Then
for each $z_0 \in N$, there exists a $P$-polarized section $s$ of $L$, defined
in a neighborhood of $z_0$, such that $s(z_0) \neq 0$.

We defer the proof of Theorem 23.31 until the end of this subsection.
Suppose $s$ is as in the theorem and $s'$ is any other locally defined
$P$-polarized section. Then $s' = fs$ for some unique complex-valued function
$f$, and by the product rule for covariant derivatives, $X(f) = 0$ for all
$X \in \bar{P}_z$. This means that $f$ is holomorphic with respect to the
complex structure on $N$ for which $P$ is the (1, 0)-tangent space. Thus, we
have a preferred family of local trivializations of $L$ (the ones given by
nonvanishing local polarized sections) such that the "ratio" of any two such
trivializations is a holomorphic function. This means that we have given $L$
the structure of a "holomorphic line bundle" over the complex manifold $N$ in
such a way that the holomorphic sections of $L$ are precisely the polarized
sections with respect to $P$.

Arguing as in the proof of Proposition 14.15, it is not hard to show that
for a purely complex polarization, the space of square-integrable polarized
sections of $L$ forms a closed subspace of the prequantum Hilbert space. For
any $z \in N$, if we choose a linear identification of the fiber of $L$ over
$z$ with $\mathbb{C}$, then the map $s \mapsto s(z)$ is a linear functional on
the quantum Hilbert space. It is not hard to show, as in the proof of
Proposition 14.15, that this linear functional is continuous, and can therefore
be represented as an inner product with a unique element of the quantum
Hilbert space.


Definition 23.32 Let $P$ be a purely complex polarization on $N$. For each
$z \in N$, choose a linear identification of the fiber of $L$ over $z$ with
$\mathbb{C}$. Then the coherent state $\chi_z$ is the unique element of the
quantum Hilbert space with respect to $P$ such that

\[
s(z) = \langle \chi_z, s \rangle
\]

for all $s$.

Suppose $N = \mathbb{R}^2$ with a polarization given by
$P_z = \mathrm{Span}(\partial/\partial z)$, where $z = x - i\alpha p$. If we
use the symplectic potential $\theta = (p\,dx - x\,dp)/2$, then, as in the
proof of Proposition 22.14, the quantum Hilbert space is naturally identifiable
with the Segal-Bargmann space. In this case, the coherent states can be read
off from Proposition 14.17.

It could happen that $\chi_z = 0$ for some $z \in N$, or even for all
$z \in N$, depending on the choice of $P$. Even if $\chi_z$ is nonzero,
$\chi_z$ is only well defined up to multiplication by a constant, because we
must choose an identification of the fiber of $L$ over $z$ with $\mathbb{C}$.
But if $\chi_z \neq 0$, the one-dimensional subspace spanned by $\chi_z$ is
independent of this choice. That is to say, whenever $\chi_z \neq 0$, the span
of $\chi_z$ is a well-defined element of the projective space
$\mathbb{P}(\mathbf{H})$, where $\mathbf{H}$ is the quantum Hilbert space.

Recall, meanwhile, that if $(L, \nabla)$ is a Hermitian line bundle with
connection having curvature $\omega/\hbar$, then for any positive integer $k$,
there is a natural Hermitian connection on $L^{\otimes k}$ having curvature
$k\omega/\hbar$. This means that if $L$ is a prequantum line bundle with one
value $\hbar_0$ of Planck's constant, then $L^{\otimes k}$ is a prequantum line
bundle with Planck's constant equal to $\hbar_0/k$. The following result shows
that in the case of compact symplectic manifolds with Kähler polarizations,
things behave nicely when $k$ tends to infinity.

Theorem 23.33 Assume $N$ is compact and let $P$ be a Kähler polarization
on $N$. For each positive integer $k$, let $\mathbf{H}_k$ denote the space of
polarized sections of $L^{\otimes k}$. Then for all $k$, $\mathbf{H}_k$ is
finite dimensional. Furthermore, for all sufficiently large $k$, we have the
following results. First, the coherent state $\chi_z \in \mathbf{H}_k$ is
nonzero for each $z \in N$. Second, the map

\[
z \mapsto \mathrm{Span}(\chi_z)
\]

is an antiholomorphic embedding of $N$ into $\mathbb{P}(\mathbf{H}_k)$.

The finite dimensionality of $\mathbf{H}_k$ is a standard result in the theory
of compact, complex manifolds. The statement that the map into
$\mathbb{P}(\mathbf{H}_k)$ is an embedding is the Kodaira embedding theorem,
which we will not prove here. The Kodaira embedding theorem implies, in
particular, that there exist nonzero, globally defined polarized sections of
$L^{\otimes k}$, at least for large $k$. Since the value of Planck's constant
for $L^{\otimes k}$ is $\hbar_0/k$, Planck's constant tends to zero as $k$
tends to infinity. Thus, the study of holomorphic sections of $L^{\otimes k}$
for large $k$ can be understood as being part of semiclassical analysis.

We now turn to the proof of Theorem 23.31, in which we will make use of basic
properties of complex-valued differential forms on complex manifolds.
("Complex-valued" means that we allow the value of a $k$-form on a collection
of $k$ tangent vectors to be a complex number.) In a holomorphic local
coordinate system $z_1, \ldots, z_n$, each form can be written in terms of
wedge products of the $dz_j$'s and $d\bar{z}_j$'s. A form is called a
$(p, q)$-form if it is a linear combination of wedge products of $p$ factors
involving the $dz_j$'s and $q$ factors involving the $d\bar{z}_j$'s. Each form
can be decomposed uniquely as a linear combination of $(p, q)$-forms for
various values of $p$ and $q$, and this decomposition does not depend on the
choice of holomorphic coordinate system. If $\alpha$ is a $(p, q)$-form, then
$d\alpha$ will be a linear combination of a $(p + 1, q)$-form and a
$(p, q + 1)$-form. We define operators $\partial$ and $\bar{\partial}$ in such
a way that $\partial$ maps $(p, q)$-forms to $(p + 1, q)$-forms,
$\bar{\partial}$ maps $(p, q)$-forms to $(p, q + 1)$-forms, and
$d = \partial + \bar{\partial}$. In particular,

\[
\partial\bigl(f\,dz_{j_1} \wedge \cdots \wedge dz_{j_p} \wedge
  d\bar{z}_{k_1} \wedge \cdots \wedge d\bar{z}_{k_q}\bigr)
  = \sum_{l} \frac{\partial f}{\partial z_l}\,dz_l \wedge
    dz_{j_1} \wedge \cdots \wedge dz_{j_p} \wedge
    d\bar{z}_{k_1} \wedge \cdots \wedge d\bar{z}_{k_q},
\]

and similarly for $\bar{\partial}$ with $(\partial f/\partial z_l)\,dz_l$
replaced by $(\partial f/\partial \bar{z}_l)\,d\bar{z}_l$.

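The maps $\partial$ and $\bar{\partial}$ satisfy the identities:

\[
\partial\partial = \bar{\partial}\bar{\partial} = 0, \qquad
\partial\bar{\partial} = -\bar{\partial}\partial.
\]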

The Dolbeault lemma states that if a $(p, q)$-form $\alpha$ satisfies
$\partial\alpha = 0$, then $\alpha$ can be expressed locally as
$\partial\beta$ for some $(p - 1, q)$-form $\beta$, and if
$\bar{\partial}\alpha = 0$, then $\alpha$ can be expressed locally as
$\bar{\partial}\beta$ for some $(p, q - 1)$-form $\beta$. A $(p, 0)$-form
$\alpha$ is said to be holomorphic if it can be expressed in holomorphic
coordinates as a sum of terms of the form

\[
f(z)\,dz_{j_1} \wedge \cdots \wedge dz_{j_p},
\]

where the coefficient function $f$ is holomorphic. A $(p, 0)$-form $\alpha$ is
holomorphic if and only if $\bar{\partial}\alpha = 0$. If a holomorphic
$(p, 0)$-form $\alpha$ satisfies $d\alpha = 0$ (or, equivalently,
$\partial\alpha = 0$), then $\alpha$ can be written locally as
$\alpha = d\beta$, for some holomorphic $(p - 1, 0)$-form $\beta$.

Let $P$ be a purely complex polarization on $N$ and let $J$ be the
almost-complex structure for which $P_z$ is the (1, 0)-tangent space at $z$.
Since (Proposition 23.18) $\omega$ is $J$-invariant, it follows (Exercise 6)
that $\omega$ is a (1, 1)-form.

Lemma 23.34 Let $N$ be a complex manifold with almost-complex structure $J$
and let $\omega$ be a closed, $J$-invariant, real-valued (1, 1)-form on $N$.
Then for every point $z_0 \in N$, there exists a smooth, real-valued function
$\kappa$ defined in a neighborhood of $z_0$ such that
$i\partial\bar{\partial}\kappa = \omega$.

In the case that $N$ is Kähler [i.e., the case where $\omega(X, JX) > 0$], a
function $\kappa$ as in the lemma is called a (local) Kähler potential for $N$.

Proof. By assumption, $d\omega = (\partial + \bar{\partial})\omega = 0$, from
which it follows that $\partial\omega = \bar{\partial}\omega = 0$, because
$\partial\omega$ is a (2, 1)-form and $\bar{\partial}\omega$ is a (1, 2)-form.
Thus, by the Dolbeault lemma, there exists a (1, 0)-form $\alpha$, defined in a
neighborhood of $z_0$, such that $\bar{\partial}\alpha = \omega$. Then
$\partial\alpha$ is a (2, 0)-form that satisfies

\[
\bar{\partial}\partial\alpha = -\partial\bar{\partial}\alpha
  = -\partial\omega = 0.
\]

This shows that $\partial\alpha$ is actually a holomorphic (2, 0)-form.

Since also $\partial\partial\alpha = 0$, we see that $\partial\alpha$ is
closed, which means that there exists a holomorphic 1-form $\eta$, defined in a
possibly smaller neighborhood of $z_0$, such that
$d\eta = \partial\eta = \partial\alpha$. Thus, $\partial(\alpha - \eta) = 0$,
and so by the Dolbeault lemma, there exists a function $g$, defined in a
neighborhood of $z_0$, such that $\partial g = \alpha - \eta$. Thus,
$\alpha = \eta + \partial g$ and so

\[
\omega = \bar{\partial}\alpha = \bar{\partial}\partial g
  = -\partial\bar{\partial} g,
\]

since $\bar{\partial}\eta = 0$. The function $\kappa := ig$ then satisfies
$i\partial\bar{\partial}\kappa = \omega$.

Now, a calculation in coordinates (Exercise 7) shows that the map
$\kappa \mapsto i\partial\bar{\partial}\kappa$ is real, that is, it maps
real-valued functions to real-valued 2-forms. Since $\omega$ is real, the
operator $i\partial\bar{\partial}$ must map the imaginary part of $\kappa$ to
zero. Thus, $i\partial\bar{\partial}\kappa$ is unchanged if $\kappa$ is
replaced by its real part. ■

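Proof of Theorem 23.31. Let $\kappa$ be as in Lemma 23.34 and let $\theta$ be
the real-valued 1-form given by

\[
\theta = \mathrm{Im}(\partial\kappa)
  = \frac{1}{2i}\,(\partial\kappa - \bar{\partial}\kappa). \tag{23.13}
\]

Then because $\partial^2 = \bar{\partial}^2 = 0$, we have

\[
d\theta = (\partial + \bar{\partial})\theta
  = \frac{1}{2i}\,(\bar{\partial}\partial\kappa - \partial\bar{\partial}\kappa)
  = i\,\partial\bar{\partial}\kappa
  = \omega.
\]

That is to say, $\theta$ is a symplectic potential for $\omega$. Thus, by
Proposition 23.6, we can find a local isometric trivialization $s_0$ of $L$ for
which the connection 1-form is $\theta/\hbar$.

For any vector $X$, we have

\[
\nabla_X\bigl(e^{-\kappa/(2\hbar)} s_0\bigr)
  = \Bigl( -\frac{1}{2\hbar}\,X(\kappa) - \frac{i}{\hbar}\,\theta(X) \Bigr)
    e^{-\kappa/(2\hbar)} s_0, \tag{23.14}
\]

where $X(\kappa) = d\kappa(X) = \partial\kappa(X) + \bar{\partial}\kappa(X)$.
Now, if $X$ is of type (0, 1), then $\partial\kappa(X) = 0$, in which case, if
we use (23.13), we find that the two terms on the right-hand side of (23.14)
cancel. Thus, $e^{-\kappa/(2\hbar)} s_0$ is the desired local polarized
section. ■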
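To see the construction at work in the simplest setting, here is a sketch in one complex dimension with the potential $\kappa = |z|^2$. This choice of $\kappa$, and the use of a generic holomorphic coordinate $z$, are assumptions made here for illustration; no claim is made that the resulting $\omega$ matches the normalization used elsewhere in the text.

```latex
% Potential and its first derivatives:
\[
\kappa = |z|^2, \qquad
\partial\kappa = \bar{z}\,dz, \qquad
\bar{\partial}\kappa = z\,d\bar{z}.
\]
% The 2-form of Lemma 23.34 and the symplectic potential (23.13):
\[
\omega = i\,\partial\bar{\partial}\kappa = i\,dz \wedge d\bar{z}, \qquad
\theta = \operatorname{Im}(\partial\kappa)
       = \frac{1}{2i}\bigl(\bar{z}\,dz - z\,d\bar{z}\bigr), \qquad
d\theta = \omega .
\]
% Cancellation in (23.14) for the type-(0,1) vector X = \partial/\partial\bar{z}:
\[
X(\kappa) = z, \qquad
\theta(X) = -\frac{z}{2i} = \frac{i z}{2}, \qquad
-\frac{1}{2\hbar}\,X(\kappa) - \frac{i}{\hbar}\,\theta(X)
  = -\frac{z}{2\hbar} + \frac{z}{2\hbar} = 0 .
\]
```

In this case the local polarized section is $e^{-|z|^2/(2\hbar)} s_0$, a Gaussian factor times the trivializing section.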


23.6 Quantization with Half-Forms: The Real Case 

In this section, we introduce a concept known as half-forms, which are
designed to work around the problem that, in the case of real polarizations,
there often do not exist any nonzero square-integrable polarized sections.

A polarized section $s$ for a real polarization $P$ tends to have infinite
norm, because we may get infinity from integrating $|s|^2$ along the leaves of
the polarization. To illustrate how half-forms work around this problem,
consider the case of the vertical polarization on
$\mathbb{R}^2 = T^*\mathbb{R}$. Elements of the half-form Hilbert space will be
representable in the form $s \otimes \sqrt{dx}$, where $s$ is a polarized
section of $L$ and where $\sqrt{dx}$ will be interpreted as a "section of the
square root of the canonical bundle." To compute the norm of such an object, we
first square it at each point to obtain the quantity $|s|^2\,dx$. Since $s$ is
polarized, $|s|$ is a function of $x$ only, independent of $p$. Thus,
$|s|^2\,dx$ may be thought of as a 1-form on $\mathbb{R}$, rather than on
$\mathbb{R}^2$, which we may then integrate to obtain

\[
\|s\|^2 := \int_{\mathbb{R}} |s|^2(x)\,dx.
\]

This procedure has two advantages over the one we used in Sect. 22.4,
where we simply integrated $|s|^2$ itself over $\mathbb{R}$. First, a version
of this procedure works for real polarizations on general symplectic
manifolds. Second, the half-form approach will allow quantized observables to
be self-adjoint, which was not the case in Sect. 22.5 when we simply
restricted prequantized observables to the polarized subspace. (See the
discussion following Proposition 22.12.)
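The contrast between the two ways of computing a norm can be made concrete. The sketch below assumes a polarized section with $|s(x, p)| = e^{-x^2/2}$, a sample profile chosen here for illustration rather than taken from the text.

```latex
% Integrating |s|^2 over all of R^2 diverges, because |s| does not depend on p:
\[
\int_{\mathbb{R}^2} |s|^2 \,dx\,dp
  = \int_{\mathbb{R}} e^{-x^2} \Bigl( \int_{\mathbb{R}} dp \Bigr) dx
  = \infty .
\]
% The half-form norm integrates the 1-form |s|^2 dx over the leaf space R only:
\[
\bigl\| s \otimes \sqrt{dx} \bigr\|^2
  = \int_{\mathbb{R}} e^{-x^2}\,dx
  = \sqrt{\pi} .
\]
```

The divergence comes entirely from the integral along the leaves, which is exactly the direction that the half-form construction removes.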

Throughout this section, we assume that $N$ is a quantizable symplectic
manifold, that $L$ is a fixed prequantum line bundle over $N$, and that $P$ is
a fixed purely real polarization on $N$.


23.6.1 The Space of Leaves

Recall that a leaf of $P$ is a maximal connected, integral submanifold of
$P$. We may then form the leaf space $\Xi$ (the set of all leaves of $P$) and
a quotient map $q : N \to \Xi$ sending each point $z \in N$ to the unique leaf
containing $z$. We may topologize $\Xi$ by defining a set $U$ in $\Xi$ to be
open if $q^{-1}(U)$ is open in $N$.

In order to be able to carry out the program of geometric quantization
with respect to $P$, we must assume that $\Xi$ can be given the structure
of a smooth, $n$-dimensional manifold in such a way that $q : N \to \Xi$ is
smooth and such that the kernel of $q_{*,z}$ is equal to $P_z^{\mathbb{R}}$,
the intersection of $P_z$ with the real tangent space $T_z N$. We abbreviate
this assumption on $\Xi$ by saying that $\Xi$ is a smooth manifold. In the
case $N = T^*M$ with the vertical polarization (Example 23.17), the leaf space
$\Xi$ is a smooth manifold diffeomorphic to $M$.

It should be emphasized that even if $\Xi$ is a smooth manifold, there is no
canonical "volume measure" on $\Xi$. Thus, our half-form Hilbert space will
be defined in such a way that the pointwise "square" of an element will
be an $n$-form, rather than a function, on the leaf space, which can then be
integrated over the $n$-manifold $\Xi$.


23.6.2 The Canonical Bundle 

We now introduce the canonical bundle of a purely real polarization $P$,
with sections that are a special sort of $n$-form on $N$, along with a notion
of polarized section of the canonical bundle. If the leaf space $\Xi$ is a
smooth manifold, the space of polarized sections of the canonical bundle can
be identified with the space of all $n$-forms on the $n$-manifold $\Xi$.

Definition 23.35 The canonical bundle $\mathcal{K}_P$ of $P$ is the real line
bundle with sections that are $n$-forms $\alpha$ having the property that

\[
X \lrcorner\, \alpha = 0 \tag{23.15}
\]

for every vector field $X$ lying in $P$. A section $\alpha$ of $\mathcal{K}_P$
is polarized if

\[
X \lrcorner\, (d\alpha) = 0 \tag{23.16}
\]

for every vector field $X$ lying in $P$.

If an $n$-form $\alpha$ satisfies (23.15), then
$\alpha(X_1, \ldots, X_n) = 0$ if any of the $X_j$'s belongs to $P$. Thus, the
value of $\alpha$ at any point $z$ can be viewed as an $n$-linear, alternating
functional on the quotient vector space $T_z N / P_z^{\mathbb{R}}$, where
$P_z^{\mathbb{R}}$ is the intersection of $P_z$ with the real tangent space.
Since this quotient space is $n$-dimensional, we see that at each point, the
space of possible values for $\alpha$ is one dimensional.

Meanwhile, if $\alpha$ satisfies (23.16), then at each point, $d\alpha$ is an
$(n + 1)$-linear, alternating functional on $T_z N / P_z^{\mathbb{R}}$, which
must be zero. Thus, for sections of $\mathcal{K}_P$, (23.16) is equivalent to
the condition

\[
d\alpha = 0. \tag{23.17}
\]

We can also introduce the complexified canonical bundle
$\mathcal{K}_P^{\mathbb{C}}$, the sections of which are complex-valued
$n$-forms satisfying (23.15). We define a section of
$\mathcal{K}_P^{\mathbb{C}}$ to be polarized if it satisfies (23.16).

Example 23.36 Let $N = T^*\mathbb{R}^n = \mathbb{R}^{2n}$ and let $P$ be the
vertical polarization on $N$. Then an $n$-form $\alpha$ on $\mathbb{R}^{2n}$
is a section of $\mathcal{K}_P$ if and only if $\alpha$ is of the form

\[
\alpha = f(x, p)\,dx_1 \wedge \cdots \wedge dx_n, \tag{23.18}
\]

and $\alpha$ is a polarized section of $\mathcal{K}_P$ if and only if $\alpha$
is of the form

\[
\alpha = g(x)\,dx_1 \wedge \cdots \wedge dx_n, \tag{23.19}
\]

for smooth functions $f$ on $\mathbb{R}^{2n}$ and $g$ on $\mathbb{R}^n$.

Proof. If $\alpha$ contained any term involving $dp_j$, the contraction of
$\alpha$ with $\partial/\partial p_j$ would not be zero, leaving (23.18) as the
only possible form for a section of $\mathcal{K}_P$. Assuming $\alpha$ is of
the form (23.18), if $f$ is not independent of $p$, then $d\alpha$ will contain
a nonzero term of the form $dp_j \wedge dx_1 \wedge \cdots \wedge dx_n$,
leaving (23.19) as the only possible form for a polarized section of
$\mathcal{K}_P$. ■
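For $n = 1$ the two conditions in Example 23.36 can be checked by hand. The following sketch writes out the contractions explicitly; the auxiliary function $h$ is introduced here only for illustration.

```latex
% A 1-form with a dp component fails (23.15):
\[
\alpha = f\,dx + h\,dp
\quad\Longrightarrow\quad
\frac{\partial}{\partial p} \lrcorner\, \alpha = h ,
\]
% so (23.15) forces h = 0, that is, \alpha = f(x,p)\,dx as in (23.18).
% For such an \alpha,
\[
d\alpha = \frac{\partial f}{\partial p}\,dp \wedge dx,
\qquad
\frac{\partial}{\partial p} \lrcorner\, d\alpha
  = \frac{\partial f}{\partial p}\,dx ,
\]
% so (23.16) holds exactly when \partial f/\partial p = 0, giving (23.19).
```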

In Example 23.36, the polarized sections of $\mathcal{K}_P$ are effectively
just $n$-forms on the configuration space $\mathbb{R}^n$. This conclusion is a
special case of the following result.

Proposition 23.37 If the leaf space $\Xi$ of $P$ is a smooth manifold and
$\alpha$ is a polarized section of $\mathcal{K}_P$, then there exists a unique
$n$-form $\beta$ on $\Xi$ such that

\[
\alpha = q^*(\beta),
\]

where $q : N \to \Xi$ is the quotient map. Conversely, if $\beta$ is any
$n$-form on $\Xi$, then $\alpha := q^*(\beta)$ is a polarized section of
$\mathcal{K}_P$.

Proof. Suppose, first, that $\alpha = q^*(\beta)$, for an $n$-form $\beta$ on
$\Xi$. Then $X \lrcorner\, \alpha = 0$ whenever $X$ lies in $P$, since $P$ is
the kernel of $q_*$. Furthermore, $d\alpha = q^*(d\beta) = 0$, since $\beta$
is an $n$-form on an $n$-manifold, showing that $\alpha$ is a polarized section
of $\mathcal{K}_P$.

In the other direction, we have already noted in the proof of Proposition
23.26 that $N$ can be identified locally with a neighborhood $U \times V$ of
the origin in $\mathbb{R}^n \times \mathbb{R}^n$ in such a way that leaves of
$P$ correspond to the sets of the form $\{x\} \times V$. We can use $q$ to
identify $U \cong U \times \{0\}$ with an open set $\tilde{U}$ in $\Xi$. Thus,
$P$ looks locally just like the vertical polarization on $\mathbb{R}^{2n}$,
and so, by Example 23.36, any polarized section $\alpha$ of $\mathcal{K}_P$
will be of the form (23.19). Thus, $\alpha$ determines an $n$-form $\beta$ on
$U$ and $\alpha$ is the pullback of $\beta$ by the projection map of
$U \times V$ onto $U$. It follows that $\alpha$ is locally the pullback by $q$
of an $n$-form $\beta$ on $\tilde{U}$. We leave it to the reader to check that
overlapping neighborhoods in $N$ give the same form $\beta$ on $\Xi$ and that
the desired result holds globally. ■

Recall from Theorem 23.24 that $Q_{\mathrm{pre}}(f)$ preserves the space of
polarized sections with respect to $P$, provided that the flow of $X_f$
preserves $\bar{P}$ (which equals $P$, in this case). We now establish that for
any such $f$, the Lie derivative $\mathcal{L}_{X_f}$ preserves the space of
polarized sections of $\mathcal{K}_P$. This result will eventually allow us to
define a quantum operator $Q(f)$ on the half-form Hilbert space associated to
$P$.

Proposition 23.38 Suppose $X$ is a vector field on $N$ that preserves $P$,
in the sense of Definition 23.22, and suppose $\alpha$ is a smooth section of
$\mathcal{K}_P$. Then the Lie derivative $\mathcal{L}_X \alpha$ is another
section of $\mathcal{K}_P$ and if $\alpha$ is polarized,
$\mathcal{L}_X \alpha$ is also polarized.

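Proof. Suppose $X_1, \ldots, X_n$ are smooth vector fields, with $X_1$ lying
in $P = \bar{P}$. Then, by a standard formula for the Lie derivative,

\[
\begin{aligned}
(\mathcal{L}_X \alpha)(X_1, \ldots, X_n)
  ={}& X\bigl(\alpha(X_1, \ldots, X_n)\bigr)
       - \alpha([X, X_1], X_2, \ldots, X_n) \\
     & - \sum_{j=2}^{n}
       \alpha(X_1, \ldots, X_{j-1}, [X, X_j], X_{j+1}, \ldots, X_n).
\end{aligned}
\tag{23.20}
\]

Now, because $\alpha$ is a section of $\mathcal{K}_P$, the first and third
terms on the right-hand side of (23.20) vanish. Because $X$ preserves $P$,
$[X, X_1]$ will again lie in $P$, and so the second term vanishes as well.
Thus, $X_1 \lrcorner\, (\mathcal{L}_X \alpha) = 0$, which means that
$\mathcal{L}_X \alpha$ is again a section of $\mathcal{K}_P$.

Since $\mathcal{L}_X \alpha = X \lrcorner\, d\alpha + d(X \lrcorner\, \alpha)$,
if $\alpha$ satisfies (23.17), we have

\[
d(\mathcal{L}_X \alpha) = d^2(X \lrcorner\, \alpha) = 0,
\]

showing that $\mathcal{L}_X \alpha$ is again polarized. ■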

Proposition 23.39 Suppose the leaf space $\Xi$ of $P$ is a smooth manifold
and that a vector field $X$ on $N$ preserves $P$. Then there exists a unique
vector field $Y$ on $\Xi$ such that

\[
q_{*,z}(X_z) = Y_{q(z)} \tag{23.21}
\]

for all $z \in N$. Furthermore, if $\alpha = q^*(\beta)$ is a polarized
section of $\mathcal{K}_P$, as in Proposition 23.37, then

\[
\mathcal{L}_X(q^*(\beta)) = q^*(\mathcal{L}_Y(\beta)). \tag{23.22}
\]

That is to say, under the identification in Proposition 23.37 of polarized
sections of $\mathcal{K}_P$ with $n$-forms on $\Xi$, the operator
$\mathcal{L}_X$ corresponds to the Lie derivative on $\Xi$ in the direction
of $Y$.

Proof. By Definition 23.22, $[X, Z]$ lies in $P$ whenever the vector field
$Z$ lies in $P$. Thus, if a function $\phi$ is constant along $P$ (i.e.,
annihilated by every vector field $Z$ lying in $P$), the same will be true of
$X\phi$. Thus, if $\phi$ is of the form $\phi = \psi \circ q$ for some
function $\psi$ on $\Xi$, then $X\phi$ is of the form $\tilde{\psi} \circ q$
for some other function $\tilde{\psi}$ on $\Xi$. The map
$\psi \mapsto \tilde{\psi}$ is easily seen to be a vector field, that is, a
derivation of $C^\infty(\Xi)$. We conclude, then, that there is a unique
vector field $Y$ on $\Xi$ such that

\[
X(\psi \circ q) = (Y\psi) \circ q \tag{23.23}
\]

for every smooth function $\psi$ on $\Xi$. It then follows from the definition
of the differential that (23.21) holds for all $z \in N$. From (23.21), it
follows easily that for any $n$-form $\beta$ on $\Xi$, we have

\[
X \lrcorner\, (q^*(\beta)) = q^*(Y \lrcorner\, \beta). \tag{23.24}
\]

Since $\beta$, being a top-degree form, is closed, $q^*(\beta)$ is also closed.
Thus, one of the terms in the formula (21.7) for the Lie derivative of $\beta$
and $q^*(\beta)$ is zero. Applying $d$ to both sides of (23.24) then gives
(23.22). ■

Given a vector field $Y$ and a nowhere-vanishing $n$-form $\beta$ on $\Xi$,
let $\mathrm{div}_\beta Y$ be the unique function on $\Xi$ such that

\[
\mathcal{L}_Y(\beta) = (\mathrm{div}_\beta Y)\,\beta.
\]

Then by (23.22), we have

\[
\mathcal{L}_X(q^*(\beta)) = ((\mathrm{div}_\beta Y) \circ q)\,q^*(\beta).
\tag{23.25}
\]

The expression (23.25) will be helpful in analyzing the quantization of
observables in Sect. 23.6.5.
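In coordinates, $\mathrm{div}_\beta Y$ reduces to a weighted version of the usual divergence. The sketch below assumes that the leaf space is (an open subset of) $\mathbb{R}^n$ with coordinates $x_1, \ldots, x_n$, and writes $\beta$ in terms of a nowhere-vanishing density $\rho$; both $\rho$ and the components $Y_j$ are notation introduced here for illustration.

```latex
% Setup: \beta = \rho\,dx_1 \wedge \cdots \wedge dx_n,
%        Y = \sum_j Y_j\,\partial/\partial x_j.
% Since d\beta = 0, the Lie derivative is \mathcal{L}_Y\beta = d(Y \lrcorner\, \beta):
\[
\mathcal{L}_Y \beta
  = d\Bigl( \sum_{j=1}^{n} (-1)^{j-1} \rho\,Y_j\,
      dx_1 \wedge \cdots \wedge \widehat{dx_j} \wedge \cdots \wedge dx_n \Bigr)
  = \Bigl( \sum_{j=1}^{n} \frac{\partial (\rho\,Y_j)}{\partial x_j} \Bigr)
      dx_1 \wedge \cdots \wedge dx_n ,
\]
% and therefore
\[
\operatorname{div}_\beta Y
  = \frac{1}{\rho} \sum_{j=1}^{n} \frac{\partial (\rho\,Y_j)}{\partial x_j} .
\]
```

For $\rho \equiv 1$ this is the ordinary divergence $\sum_j \partial Y_j/\partial x_j$.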

23.6.3 Square Roots of the Canonical Bundle 

We now assume that the leaf space $\Xi$ of $P$ is an orientable manifold, and
we choose one particular orientation of $\Xi$.

Definition 23.40 Choose a nowhere-vanishing, oriented $n$-form $\beta$ on
$\Xi$, so that $\alpha := q^*(\beta)$ is (Proposition 23.37) a
nowhere-vanishing section of $\mathcal{K}_P$. A section of $\mathcal{K}_P$ is
non-negative if it is, at each point, a non-negative multiple of $\alpha$.
This notion does not depend on the choice of oriented $n$-form $\beta$.

Since $\Xi$ is orientable, the canonical bundle $\mathcal{K}_P$ is
trivializable, since the section $\alpha$ in Definition 23.40 is a globally
trivializing section. Thus, we can find a square root of $\mathcal{K}_P$, that
is, a line bundle $\delta_P$ such that $\delta_P \otimes \delta_P$ is
isomorphic to $\mathcal{K}_P$. (We may, for example, take $\delta_P$ to be the
trivial bundle.) When we speak of a square root of $\mathcal{K}_P$, we will
mean, more precisely, a bundle $\delta_P$ together with a particular
isomorphism of $\delta_P \otimes \delta_P$ with $\mathcal{K}_P$. Thus, if
$s_1$ and $s_2$ are sections of $\delta_P$, we think of $s_1 \otimes s_2$ as
being a section of $\mathcal{K}_P$. We assume, further, that the isomorphism of
$\delta_P \otimes \delta_P$ with $\mathcal{K}_P$ is chosen so that for any
section $s$ of $\delta_P$, the section $s \otimes s$ of $\mathcal{K}_P$ is
non-negative. (If the initial isomorphism of $\delta_P \otimes \delta_P$ with
$\mathcal{K}_P$ does not have this property, compose it with $-I$ in the fibers
of $\mathcal{K}_P$.)

We may consider the complexification of $\delta_P$, that is, the line bundle
$\delta_P^{\mathbb{C}}$ whose fiber at each point is the complexification of
the fiber of $\delta_P$. There is then a notion of complex conjugation for
sections of $\delta_P^{\mathbb{C}}$, which fixes the fiber of $\delta_P$ inside
the fiber of $\delta_P^{\mathbb{C}}$ at each point. If $s_1$ and $s_2$ are
sections of $\delta_P^{\mathbb{C}}$, we think of $s_1 \otimes s_2$ as a section
of the complexified canonical bundle $\mathcal{K}_P^{\mathbb{C}}$.

If $\alpha$ is a section of $\mathcal{K}_P$ and $X$ is a vector field lying in
$P$, let us define an $n$-form $\nabla_X \alpha$ by

\[
\nabla_X \alpha = X \lrcorner\, (d\alpha). \tag{23.26}
\]

Since $\alpha$ is a section of $\mathcal{K}_P$, we have
$X \lrcorner\, \alpha = 0$, which means that $\nabla_X \alpha$ actually
coincides with $\mathcal{L}_X \alpha$, by (21.7). Since it lies in $P$, the
vector field $X$ preserves $P$, and thus
$\nabla_X \alpha = \mathcal{L}_X \alpha$ is again a section of
$\mathcal{K}_P$, by Proposition 23.38. The operator $\nabla$ in (23.26) has all
the properties of a connection on $\mathcal{K}_P$ except that it is only
defined in the directions of $P$. [Note that $\mathcal{L}_X$ does not, in
general, satisfy the condition $\mathcal{L}_{fX} = f\mathcal{L}_X$, as required
by Definition 23.2. Since, however, $\mathcal{L}_X \alpha$ can also be computed
as in (23.26), for any section $\alpha$ of $\mathcal{K}_P$, the map $\nabla$
does satisfy $\nabla_{fX} = f\nabla_X$.]

We call $\nabla$ the natural partial connection on $\mathcal{K}_P$. According
to Definition 23.35, a section $\alpha$ of $\mathcal{K}_P$ is polarized if and
only if $\nabla_X \alpha = 0$ for each vector field $X$ lying in $P$. We now
show that both the partial connection and the Lie derivative "descend" to
sections of $\delta_P$ in a natural way. This result will, in particular, allow
us to define a notion of polarized sections of $\delta_P$.

Proposition 23.41 Let $\delta_P$ be a fixed square root of $\mathcal{K}_P$. For any
vector field $X$ lying in $P$, there is a unique linear operator $\nabla_X$
mapping sections of $\delta_P$ to sections of $\delta_P$, such that

\begin{align}
\nabla_X(f s_1) &= X(f)\, s_1 + f\, \nabla_X s_1 \tag{23.27} \\
\nabla_X(s_1 \otimes s_2) &= (\nabla_X s_1) \otimes s_2 + s_1 \otimes (\nabla_X s_2) \tag{23.28}
\end{align}

for all smooth functions $f$ and all sections $s_1$ and $s_2$ of $\delta_P$. On the
left-hand side of (23.28), $\nabla_X$ is the partial connection on $\mathcal{K}_P$
given by (23.26).
If $X$ is a vector field on $N$ that preserves $P$, then there is a unique
linear operator $\mathcal{L}_X$, mapping sections of $\delta_P$ to sections of
$\delta_P$, such that

\begin{align*}
\mathcal{L}_X(f s_1) &= X(f)\, s_1 + f\, \mathcal{L}_X s_1 \\
\mathcal{L}_X(s_1 \otimes s_2) &= (\mathcal{L}_X s_1) \otimes s_2 + s_1 \otimes (\mathcal{L}_X s_2)
\end{align*}

for all smooth functions $f$ and all sections $s_1$ and $s_2$ of $\delta_P$.

Both of these constructions extend naturally from sections of $\delta_P$ to
sections of $\delta_P^{\mathbb{C}}$.

We may then say that a section $s$ of $\delta_P^{\mathbb{C}}$ is polarized if
$\nabla_X s = 0$ for every smooth vector field $X$ lying in $P$.

Proof. If $V$ is a one-dimensional vector space, then the map
$\otimes : V \times V \to V \otimes V$ is commutative: $u \otimes v = v \otimes u$ for all
$u, v \in V$. Furthermore, if $u_0$ is a nonzero element of $V$, then the map
$u \mapsto u \otimes u_0$ is an invertible linear map of $V$ to $V \otimes V$. Suppose
$s_0$ is a local nonvanishing section of $\delta_P$. Applying (23.28) with
$s_1 = s_2 = s_0$, we want

\[ 2(\nabla_X s_0) \otimes s_0 = \nabla_X(s_0 \otimes s_0). \tag{23.29} \]

Since the operation of tensoring with $s_0$ is invertible, there is a unique
section "$\nabla_X s_0$" of $\delta_P$ for which (23.29) holds.

Locally, any section $s$ of $\delta_P$ can be written as $s = g s_0$ for a unique
function $g$. We then define $\nabla_X s$ by

\[ \nabla_X s = X(g)\, s_0 + g\, \nabla_X s_0, \tag{23.30} \]

in which case, (23.27) is easily seen to hold. If $s_1 = g_1 s_0$ and
$s_2 = g_2 s_0$, then using (23.29) and the symmetry of the tensor product, it
is easy to verify that (23.28) holds, with both sides of the equation equal
to

\[ X(g_1 g_2)\, s_0 \otimes s_0 + g_1 g_2\, \nabla_X(s_0 \otimes s_0). \]

Uniqueness of $\nabla_X$ holds because both (23.29) and (23.30) are required
by the definition of $\nabla_X$. The action of $\nabla_X$ extends to sections of
$\delta_P^{\mathbb{C}}$, by writing such sections as complex-valued functions
times $s_0$. The analysis of the Lie derivative is similar and is omitted. ■
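
As a worked illustration of the construction in the proof (a sketch only; the function $h$ is introduced here for the example and does not appear in the original argument), suppose the partial connection on the canonical bundle acts on the local section $s_0 \otimes s_0$ by multiplication by some function $h$. Then (23.29) and (23.30) give an explicit formula for the partial connection on the square-root bundle:

```latex
\begin{align*}
\nabla_X(s_0 \otimes s_0) &= h\,(s_0 \otimes s_0)
  \quad\Longrightarrow\quad
  2(\nabla_X s_0) \otimes s_0 = h\,(s_0 \otimes s_0)
  \quad\Longrightarrow\quad
  \nabla_X s_0 = \tfrac{1}{2} h\, s_0, \\
\nabla_X(g s_0) &= X(g)\, s_0 + g\, \nabla_X s_0
  = \bigl( X(g) + \tfrac{1}{2} g h \bigr) s_0 .
\end{align*}
```

In other words, passing from the canonical bundle to its square root halves the connection coefficient, which is the origin of the factors of 1/2 that appear in the quantized observables later in this section.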

23.6.4 The Half-Form Hilbert Space 

We continue to assume that the leaf space $\Xi$ of $P$ is an orientable
manifold, and that we have chosen an orientation on $\Xi$. We assume that we
have chosen a square root $\delta_P$ of $\mathcal{K}_P$, as in Sect. 23.6.3. If $L$ is
a prequantum line bundle over $N$, we now form the tensor product bundle
$L \otimes \delta_P^{\mathbb{C}}$. Given two sections $s_1$ and $s_2$ of
$L \otimes \delta_P^{\mathbb{C}}$, we decompose them locally as
$s_j = \mu_j \otimes \nu_j$, where $\mu_j$ is a section of $L$ and $\nu_j$ is a section of
$\delta_P^{\mathbb{C}}$, and where, say, the $\mu_j$'s are taken to be
nonvanishing. Then we can combine these sections to form the quantity

\[ (s_1, s_2) := (\mu_1, \mu_2)\, \overline{\nu_1} \otimes \nu_2, \tag{23.31} \]

where $(\mu_1, \mu_2)$ is the pointwise inner product given by the Hermitian
structure on $L$. Since $(\mu_1, \mu_2)$ is a scalar-valued function and
$\overline{\nu_1} \otimes \nu_2$ is a section of $\mathcal{K}_P^{\mathbb{C}}$, the quantity
$(s_1, s_2)$ is a section of $\mathcal{K}_P^{\mathbb{C}}$. Any other decomposition of
$s_j$ as the tensor product of a nonvanishing section of $L$ and a section
of $\delta_P^{\mathbb{C}}$ is of the form $(f\mu_j) \otimes (\nu_j/f)$ for some
nonvanishing function $f$, and the value of $(s_1, s_2)$ is the same as for
the original decomposition. Since it is independent of the choice of local
decomposition, $(s_1, s_2)$ is actually defined globally.

Given the connection on $L$ and the partial connection (Proposition 23.41)
on $\delta_P^{\mathbb{C}}$, we can form a partial connection on
$L \otimes \delta_P^{\mathbb{C}}$ with the following property. For any vector field
$X$ lying in $P$, and any section $s$ of $L \otimes \delta_P^{\mathbb{C}}$, if we
decompose $s$ locally as $s = \mu \otimes \nu$, where $\mu$ is a nonvanishing section
of $L$ and $\nu$ is a section of $\delta_P^{\mathbb{C}}$, then

\[ \nabla_X(s) = (\nabla_X \mu) \otimes \nu + \mu \otimes (\nabla_X \nu). \tag{23.32} \]

The reader may verify that if $\mu \otimes \nu$ is replaced by $(f\mu) \otimes (\nu/f)$ for
some nonvanishing function $f$, the value of $\nabla_X(s)$ is unchanged. Thus,
as with the quantity $(s_1, s_2)$ in (23.31), $\nabla_X(s)$ is defined globally.
We then define a section $s$ of $L \otimes \delta_P^{\mathbb{C}}$ to be polarized if
$\nabla_X s = 0$ for each vector field $X$ lying in $P$. If $s_1$ and $s_2$ are
polarized sections of $L \otimes \delta_P^{\mathbb{C}}$, then the section
$(s_1, s_2)$ in (23.31) is easily seen to be a polarized section of
$\mathcal{K}_P^{\mathbb{C}}$.

As in the case without half-forms, there is an obstruction to the existence
of globally defined polarized sections of $L \otimes \delta_P^{\mathbb{C}}$. We say
that a leaf $R$ is Bohr-Sommerfeld (in the half-form sense, with respect to
a particular choice of $\delta_P$) if there exists a nonzero section $s$ of
$L \otimes \delta_P^{\mathbb{C}}$ defined over $R$ such that $\nabla_X s = 0$ for each
tangent vector $X$ to $R$. As in the case without half-forms, if the leaves
are topologically nontrivial, the Bohr-Sommerfeld leaves will in general be
a discrete set in the space of all leaves.

The Bohr-Sommerfeld leaves in the half-form sense need not be the same
as the Bohr-Sommerfeld leaves in the sense of Definition 23.27. In the
setting of Example 23.29, for instance, the canonical bundle $\mathcal{K}_P$ is
trivial, but the square-root bundle $\delta_P$ may be chosen to be nontrivial,
by putting in a twist by 180 degrees over each copy of $S^1$. (That is to
say, we think of $S^1$ as the interval $[0, 2\pi]$ with the ends identified,
and we attach a copy of $\mathbb{R}$ to each point. But when identifying the fiber
at $2\pi$ with the fiber at 0, we use the negative of the identity map.) As
Exercise 9 shows, in this example, the Bohr-Sommerfeld leaves are the sets
of the form $\{x\} \times S^1$, where $x/\hbar = n + 1/2$ for some integer $n$.

Definition 23.42 For any purely real polarization $P$ and any square root
$\delta_P$ of $\mathcal{K}_P$, the half-form space is the space of smooth, polarized
sections of $L \otimes \delta_P^{\mathbb{C}}$. For a polarized section $s$ of
$L \otimes \delta_P^{\mathbb{C}}$, define the norm of $s$ by

\[ \|s\|^2 = \int_\Xi \widetilde{(s, s)}, \tag{23.33} \]

where $(s, s)$ is as in (23.31) and where $\widetilde{(s, s)}$ is the $n$-form on
$\Xi$ given by Proposition 23.37. If $s_1$ and $s_2$ are elements of the
half-form space with $\|s_1\| < \infty$ and $\|s_2\| < \infty$, define the inner
product of $s_1$ and $s_2$ by

\[ \langle s_1, s_2 \rangle = \int_\Xi \widetilde{(s_1, s_2)}. \]

The half-form Hilbert space is the completion with respect to the norm
(23.33) of the space of polarized sections $s$ for which $\|s\|^2 < \infty$.

The integral of $n$-forms on $\Xi$ is taken with respect to the chosen
orientation on $\Xi$. We can always decompose $s$ locally as $s = \mu \otimes \nu$
with $\nu$ being a section of $\delta_P$ (as opposed to $\delta_P^{\mathbb{C}}$) and
$\mu$ being a section of $L$. Then

\[ (s, s) = (\mu, \mu)\, \nu \otimes \nu, \]

from which we see that $(s, s)$ is a non-negative section of $\mathcal{K}_P$
(Definition 23.40). (Recall that we have chosen the identification of
$\delta_P \otimes \delta_P$ with $\mathcal{K}_P$ in a particular way, so that $\nu \otimes \nu$ is
always the pullback by $q$ of an oriented form on $\Xi$.) Thus, the integral
on the right-hand side of (23.33) is non-negative, but possibly infinite.

Example 23.43 Let $N = T^*\mathbb{R} \cong \mathbb{R}^2$ and let $L$ be the trivial
bundle on $N$, with connection $\nabla_X = X - (i/\hbar)\theta(X)$, where
$\theta = p\,dx$. Let $P$ be the vertical polarization on $N$ and orient $\mathbb{R}$
so that oriented 1-forms are positive multiples of $dx$. Let $\delta_P$ be the
trivial bundle, with a trivializing section "$\sqrt{dx}$" of $\delta_P$ such that
$\sqrt{dx} \otimes \sqrt{dx} = dx$. Then every polarized section $s$ of
$L \otimes \delta_P^{\mathbb{C}}$ has the form

\[ s = \psi(x) \otimes \sqrt{dx} \tag{23.34} \]

for some function $\psi$ on $\mathbb{R}$. The norm of such a section is computed as

\[ \|s\|^2 = \int_{\mathbb{R}} |\psi(x)|^2 \, dx. \]

Proof. The sections of $\mathcal{K}_P$ are 1-forms that are zero on
$\partial/\partial p$, that is, 1-forms of the form $\alpha = f(x, p)\,dx$. Such a 1-form
satisfies $d\alpha = 0$ if and only if $f$ is independent of $p$. Thus, $dx$ is a
globally defined polarized section of $\mathcal{K}_P$. If we choose $\delta_P$ to be
trivial and let $\sqrt{dx}$ be such that $\sqrt{dx} \otimes \sqrt{dx} = dx$, then
$\sqrt{dx}$ will be a polarized section of $\delta_P$. Every section $s$ of
$L \otimes \delta_P^{\mathbb{C}}$ can be written uniquely as
$s = \psi(x, p) \otimes \sqrt{dx}$ for some function $\psi$. Since $\sqrt{dx}$ is
polarized and $\theta(\partial/\partial p) = 0$, we see that $s$ is polarized if and only
if $\psi$ is independent of $p$. For a section of the form (23.34), we have
$(s, s) = |\psi(x)|^2\,dx$, in which case, $\widetilde{(s, s)}$ is given by the same
formula as $(s, s)$, but now interpreted as a 1-form on $\Xi = \mathbb{R}$ rather
than $\mathbb{R}^2$. ■
23.6.5 Quantization of Observables

Suppose $f$ is a function on $N$ for which $X_f$ preserves $P$ in the sense of
Definition 23.22. We will now associate with $f$ a self-adjoint (or, at
least, symmetric) operator $Q(f)$ on the half-form Hilbert space of $P$.
Operators of this sort will satisfy exactly the desired commutation
relations.

Definition 23.44 For any function $f$ on $N$ for which $X_f$ preserves $P$,
let $Q(f)$ be the operator on the half-form space of $P$ given by

\[ Q(f)s = (Q_{\mathrm{pre}}(f)\mu) \otimes \nu + i\hbar\, \mu \otimes \mathcal{L}_{X_f} \nu, \]

where $s$ is decomposed locally as $s = \mu \otimes \nu$, with $\mu$ being a section of
$L$ and $\nu$ a section of $\delta_P^{\mathbb{C}}$.

The operator $Q(f)$ is well defined (i.e., independent of the choice of local
decomposition), as may easily be verified. This independence holds, however,
only because the coefficient $i\hbar$ of $\nabla_{X_f}$ in the first term exactly
matches the coefficient $i\hbar$ of $\mathcal{L}_{X_f}$ in the second term.

Before describing the general properties of the operators Q(f), we con¬ 
sider a simple example that illustrates the essential role of the Lie derivative 
term in Definition 23.44. 

Example 23.45 Let the notation be as in Example 23.43, and let
$f : \mathbb{R}^2 \to \mathbb{R}$ be of the form

\[ f(x, p) = a(x) + b(x)p, \]

for some smooth functions $a$ and $b$ on $\mathbb{R}$. Then $X_f$ preserves $P$ and

\[ Q(f)(\psi(x) \otimes \sqrt{dx}) = \tilde\psi(x) \otimes \sqrt{dx}, \]

where

\[ \tilde\psi(x) = -i\hbar \left( b(x)\psi'(x) + \tfrac{1}{2} b'(x)\psi(x) \right) + a(x)\psi(x). \]

In particular, if $f(x, p) = x$, then $\tilde\psi(x) = x\psi(x)$ and if
$f(x, p) = p$, then $\tilde\psi(x) = -i\hbar\, d\psi/dx$. More generally, if $a$ and
$b$ are polynomials, then the action of $Q(f)$ on $\psi$ coincides with the
Weyl quantization of $f$ (Exercise 8 in Chap. 13).

The term involving $b'(x)$ comes from the presence of half-forms and is
absent in the formula (22.15) for $Q_{\mathrm{pre}}(f)$. The $b'$ term, with the
exact coefficient of 1/2, is necessary for $Q(f)$ to be self-adjoint (or, at
least, symmetric); see Exercise 10. Example 23.45 is actually quite
representative of the general case. [Compare (23.38) in the proof of
Theorem 23.47 and Example 23.48.]

Proof. We have computed $Q_{\mathrm{pre}}(f)$ in (22.15) in the proof of
Proposition 22.12. We compute that $X_f$ is equal to $-b(x)\,\partial/\partial x$ plus a
term involving $\partial/\partial p$. Since the 1-form $dx$ is closed, we obtain, by
(21.7),

\[ \mathcal{L}_{X_f}(dx) = d(X_f \lrcorner dx) = -db(x) = -b'(x)\,dx. \]

Using Proposition 23.41, we then obtain

\[ \mathcal{L}_{X_f}(\sqrt{dx}) \otimes \sqrt{dx} = -\tfrac{1}{2} b'(x)\,dx = -\tfrac{1}{2} b'(x) \sqrt{dx} \otimes \sqrt{dx}, \tag{23.35} \]

which gives

\[ \mathcal{L}_{X_f}(\sqrt{dx}) = -\tfrac{1}{2} b'(x) \sqrt{dx}. \]

Adding the $\mathcal{L}_{X_f}$ term to the previously computed expression for
$Q_{\mathrm{pre}}(f)$ gives the desired result. ■
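
The role of the coefficient 1/2 in Example 23.45 can be checked numerically. The following is a minimal sketch and not part of the original argument: it assumes units with ħ = 1, the illustrative choices a(x) = x² and b(x) = x, and two Gaussian test functions (all of these are assumptions made for the example). It evaluates the defect ⟨φ, Aψ⟩ − ⟨Aφ, ψ⟩ for the operator A = −iħ(b d/dx + c b′) + a with a trapezoid rule. Integrating by parts suggests that this defect equals −iħ(2c − 1)∫φψ dx, so it should vanish only when c = 1/2.

```python
import math

HBAR = 1.0  # assumption: units with hbar = 1


def a(x):   # illustrative choice of a(x)
    return x * x


def b(x):   # illustrative choice of b(x)
    return x


def db(x):  # derivative b'(x)
    return 1.0


# Two rapidly decaying test functions and their derivatives (hypothetical choices).
def phi(x):
    return math.exp(-x * x)


def dphi(x):
    return -2.0 * x * math.exp(-x * x)


def psi(x):
    return math.exp(-(x - 1.0) ** 2)


def dpsi(x):
    return -2.0 * (x - 1.0) * math.exp(-(x - 1.0) ** 2)


def apply_op(u, du, x, c):
    """Value at x of -i*hbar*(b u' + c b' u) + a u."""
    return -1j * HBAR * (b(x) * du(x) + c * db(x) * u(x)) + a(x) * u(x)


def inner(f, g, lo=-12.0, hi=12.0, n=4000):
    """Trapezoid approximation of the integral of conj(f) * g over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo).conjugate() * g(lo) + f(hi).conjugate() * g(hi))
    for k in range(1, n):
        x = lo + k * h
        total += f(x).conjugate() * g(x)
    return total * h


def symmetry_defect(c):
    """<phi, A psi> - <A phi, psi> for the operator with coefficient c."""
    a_psi = lambda x: apply_op(psi, dpsi, x, c)
    a_phi = lambda x: apply_op(phi, dphi, x, c)
    return inner(phi, a_psi) - inner(a_phi, psi)


if __name__ == "__main__":
    for c in (0.0, 0.5, 1.0):
        print(c, abs(symmetry_defect(c)))
```

For these test functions the defect is negligible at c = 1/2 and of order one at c = 0 and c = 1, which is consistent with the claim of Exercise 10.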

Returning now to the setting of general real polarizations, we establish
two key results for the quantized observables $Q(f)$: that they satisfy the
desired commutation relations and that they are self-adjoint (or, at least,
symmetric) whenever $f$ is real valued. It can also be shown that when $f$ is
a polarized function (i.e., constant along each leaf of $P$), then $Q(f)$ acts
on the quantum Hilbert space simply as multiplication by $f$. See Exercise 11.

Theorem 23.46 Suppose $f$ and $g$ are functions on $N$ for which $X_f$ and
$X_g$ preserve $P$. Then the operators $Q(f)$ and $Q(g)$ satisfy

\[ \frac{1}{i\hbar}[Q(f), Q(g)] = Q(\{f, g\}) \]

on the space of smooth, polarized sections of $L \otimes \delta_P^{\mathbb{C}}$.

Proof. Since $Q(h)$ is a local operator for any function $h$, it suffices to
prove the result locally. Let us choose, then, a local nonvanishing section
$\nu_0$ of $\delta_P$, so that, locally, each section $s$ of
$L \otimes \delta_P^{\mathbb{C}}$ can be decomposed uniquely as $s = \mu \otimes \nu_0$. For
any vector field $X$ preserving $P$, we let $\gamma(X)$ be the function such that

\[ \mathcal{L}_X(\nu_0) = \gamma(X)\nu_0. \]

We then have $Q(f)(\mu \otimes \nu_0) = \tilde\mu \otimes \nu_0$, where

\[ \tilde\mu = [Q_{\mathrm{pre}}(f) + i\hbar\gamma(X_f)]\mu. \]

We now compute that

\begin{align*}
&[Q_{\mathrm{pre}}(f) + i\hbar\gamma(X_f),\; Q_{\mathrm{pre}}(g) + i\hbar\gamma(X_g)] \\
&\quad = [Q_{\mathrm{pre}}(f), Q_{\mathrm{pre}}(g)] + i\hbar[Q_{\mathrm{pre}}(f), \gamma(X_g)] + i\hbar[\gamma(X_f), Q_{\mathrm{pre}}(g)] \\
&\quad = i\hbar\, Q_{\mathrm{pre}}(\{f, g\}) + (i\hbar)^2 \bigl( X_f(\gamma(X_g)) - X_g(\gamma(X_f)) \bigr).
\end{align*}

The desired result will follow if we can verify that

\[ X_f(\gamma(X_g)) - X_g(\gamma(X_f)) = \gamma(X_{\{f, g\}}). \tag{23.36} \]

To verify (23.36), we use a standard identity for the Lie derivative on
forms: $\mathcal{L}_{[X, Y]} = [\mathcal{L}_X, \mathcal{L}_Y]$. Using Proposition 23.41, we
can easily show that this identity holds also on sections of $\delta_P$, for
vector fields that preserve $P$. It is then a simple calculation
(Exercise 12) to verify (23.36). ■
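
The commutation relation of Theorem 23.46 can be tested concretely in the setting of Example 23.45. The sketch below is an illustration under stated assumptions rather than part of the proof: it takes ħ = 1, represents functions of x as coefficient lists of polynomials, uses the formula of Example 23.45 for the action of Q(f) when f = a(x) + b(x)p, and assumes the Poisson bracket convention {f, g} = ∂f/∂x ∂g/∂p − ∂f/∂p ∂g/∂x (the bracket is defined outside this section, so this convention is an assumption). All helper names are hypothetical.

```python
HBAR = 1.0  # assumption: units with hbar = 1


def padd(p, q):
    """Sum of two polynomials given as coefficient lists (index = degree)."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]


def pscale(c, p):
    """Scalar multiple of a polynomial."""
    return [c * x for x in p]


def pmul(p, q):
    """Product of two polynomials."""
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out


def pdiff(p):
    """Derivative of a polynomial."""
    return [k * p[k] for k in range(1, len(p))] or [0]


def Q(a, b, psi):
    """Q(f) psi = -i hbar (b psi' + b' psi / 2) + a psi for f = a + b p."""
    half_form_part = padd(pmul(b, pdiff(psi)),
                          pscale(0.5, pmul(pdiff(b), psi)))
    return padd(pscale(-1j * HBAR, half_form_part), pmul(a, psi))


def bracket(a1, b1, a2, b2):
    """Coefficients (a, b) with {f, g} = a + b p (assumed sign convention)."""
    a = padd(pmul(pdiff(a1), b2), pscale(-1, pmul(b1, pdiff(a2))))
    b = padd(pmul(pdiff(b1), b2), pscale(-1, pmul(b1, pdiff(b2))))
    return a, b


def commutator_defect(a1, b1, a2, b2, psi):
    """Largest coefficient of [Q(f), Q(g)] psi - i hbar Q({f, g}) psi."""
    lhs = padd(Q(a1, b1, Q(a2, b2, psi)),
               pscale(-1, Q(a2, b2, Q(a1, b1, psi))))
    a, b = bracket(a1, b1, a2, b2)
    rhs = pscale(1j * HBAR, Q(a, b, psi))
    return max(abs(c) for c in padd(lhs, pscale(-1, rhs)))


if __name__ == "__main__":
    # f = (1 + 2x^2) + (x + x^2) p,  g = 3x + (2 - x^2) p,  psi = 1 - x + x^2/2 + 2x^3
    print(commutator_defect([1, 0, 2], [0, 1, 1], [0, 3], [2, 0, -1],
                            [1, -1, 0.5, 2]))
```

Because functions of the form a(x) + b(x)p are closed under the Poisson bracket, the check stays within the class of observables to which Example 23.45 applies.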
Theorem 23.47 If $f \in C^\infty(N)$ is real valued and $X_f$ preserves $P$, then
the operator $Q(f)$ is symmetric on the space of smooth sections $s$ in the
half-form space for which $\widetilde{(s, s)}$ has compact support on $\Xi$.

Proof. Suppose $\alpha = q^*(\beta)$ is a polarized section of $\mathcal{K}_P$, so that
there is, at least locally, a corresponding polarized section
$\sqrt{q^*(\beta)}$ of $\delta_P$. If $X_f$ preserves $P$, then by Proposition 23.39,
there is a unique vector field $Y_f$ on $\Xi$ such that $q_{*,z}(X_f) = Y_f$ for
all $z \in N$. Using (23.25) and Proposition 23.41, we get

\[ \mathcal{L}_{X_f}\left( \sqrt{q^*(\beta)} \right) = \tfrac{1}{2} \bigl( (\operatorname{div}_\beta Y_f) \circ q \bigr) \sqrt{q^*(\beta)}. \]

Meanwhile, it is not hard to show (Exercise 13) that it is possible to
choose a local symplectic potential $\theta$ that is zero in the directions of
$P$. Thus, we can trivialize $L$ locally in such a way that sections that are
covariantly constant along $P$ are simply functions that are constant along
$P$ in the ordinary sense. Thus, elements $s$ of the half-form space have,
locally, the form

\[ s = (\psi \circ q) \otimes \sqrt{q^*(\beta)} \tag{23.37} \]

for some function $\psi$ and $n$-form $\beta$ on $\Xi$. Thus, if $X_f$ preserves $P$,
and a section $s$ is decomposed locally as in (23.37), we have

\[ Q(f)(s) = (\tilde\psi \circ q) \otimes \sqrt{q^*(\beta)}, \]

where

\[ \tilde\psi = i\hbar \left( Y_f \psi + \tfrac{1}{2} (\operatorname{div}_\beta Y_f)\psi \right) + (-\theta(X_f) - f)\psi. \tag{23.38} \]

It can be verified (Exercise 14) that the function $-\theta(X_f) - f$ is
constant along $P$ and thus may be thought of as a function on $\Xi$.

By multiplying elements of the half-form space by functions of the form
$\chi \circ q$, with $\chi$ having compact support in $\Xi$, we can "localize" the
calculations on $\Xi$. Suppose $s_1$ and $s_2$ are two elements of the half-form
space decomposed as in (23.37) near a point $z \in N$, with the same $\beta$ and
two different functions $\psi_1$ and $\psi_2$ on $\Xi$. Then $\widetilde{(s_1, s_2)}$ has
the form $\overline{\psi_1}\psi_2\,\beta$ in a neighborhood $U$ of $q(z)$. By
localization, we may assume that $\widetilde{(s_1, s_2)}$ has compact support in
$U$, and we then have

\[ \langle s_1, Q(f)s_2 \rangle = \int_U \overline{\psi_1}\, \tilde\psi_2\, \beta, \]

where $\tilde\psi_2$ is as in (23.38). "Integration by parts" (Exercise 15) with
respect to $\beta$ then shows that this quantity coincides with
$\langle Q(f)s_1, s_2 \rangle$. ■

Example 23.48 (Cotangent Bundles) Let $N = T^*M$ for an oriented
manifold $M$, let $\theta$ be the canonical 1-form on $N$, and let $L$ be the
trivial line bundle on $N$, with connection $\nabla_X = X - (i/\hbar)\theta(X)$. Let
$P$ be the vertical polarization on $N$, so that $\mathcal{K}_P$ is trivial, and let
$\delta_P$ be chosen to be trivial. Let $\beta$ be an arbitrary nowhere-vanishing,
oriented $n$-form on $M$, so that $\alpha := \pi^*(\beta)$ is a nowhere-vanishing
section of $\mathcal{K}_P$, and choose a trivializing section $\sqrt{\alpha}$ of
$\delta_P$ with $\sqrt{\alpha} \otimes \sqrt{\alpha} = \alpha$. In that case, elements $s$ of the
half-form Hilbert space have the form $s = (\psi \circ \pi) \otimes \sqrt{\alpha}$, where
$\psi$ is a function on $M$, and

\[ \|s\|^2 = \int_M |\psi|^2 \, \beta. \]

The half-form Hilbert space may, thus, be identified with $L^2(M, \beta)$.

Suppose now that $f$ is a function on $T^*M$ of the form $f = f_1 + f_2$, where
$f_1$ is constant on each fiber of $T^*M$ and $f_2$ is linear on each fiber. Then
$f_2$ may be thought of as a section of $T^{**}M = TM$, that is, as a vector
field $Y_f$ on $M$. In that case, $X_f$ preserves $P$ and $Q(f)$ acts on elements
of the half-form space as

\[ Q(f)\bigl( (\psi \circ \pi) \otimes \sqrt{\alpha} \bigr) = (\tilde\psi \circ \pi) \otimes \sqrt{\alpha}, \]

where

\[ \tilde\psi = i\hbar \left( Y_f \psi + \tfrac{1}{2} (\operatorname{div}_\beta Y_f)\psi \right) + f_1 \psi. \]

Here $\operatorname{div}_\beta Y_f$ is the unique function such that
$\mathcal{L}_{Y_f}\beta = (\operatorname{div}_\beta Y_f)\beta$.

A simple calculation in coordinates shows that the vector field $Y_f$ in the
example satisfies $X_f(\psi \circ \pi) = (Y_f \psi) \circ \pi$, so that our notation is
consistent with that in Proposition 23.39 [see (23.23)].

Proof. The calculation is precisely the same as in the proof of Theorem
23.47, except that the decomposition in (23.37) is now global. The claimed
form of $Q(f)$ is nothing but the expression (23.38), where the reader may
easily compute, using local coordinates, that $-\theta(X_f) - f = f_1$. ■

It is an unfortunate feature of geometric quantization that in the case 
of the vertical polarization on cotangent bundles, it only permits us to 
quantize functions that are at most linear in the momentum variables. In 
a typical physical system having T*M as its phase space, there will be a 
“kinetic energy” term in the classical Hamiltonian that is quadratic in p. 
To quantize such a system, one has to find a way to quantize the kinetic 
energy term, “by hook or by crook.” 

One approach to this problem is to allow the exponentiated quantized 
Hamiltonian to change the polarization, and then to use pairing maps 
(Sect. 23.8) to “project” back to the Hilbert space for the original polar¬ 
ization. As explained in Sect. 9.7 of [45], this approach succeeds in the 
case that the kinetic energy term is g(p,p)/(2m), where g is the Rieman- 
nian structure on T*M induced by a Riemannian structure on TM. The 
quantized kinetic energy operator turns out to be given by the map 



(23.39) 
where $\Delta$ is the Laplacian for $M$ (taken to be a negative operator) and
where $R(x)$ is the scalar curvature of the Riemannian structure on $TM$.
The calculation in [45] glosses over one technical issue, which is that the 
time-evolved polarizations may not be everywhere transverse to the original 
polarization. Nevertheless, the calculation provides a reasonable geometric 
motivation for the formula (23.39). 

It should be emphasized that, because of the projections involved in 
the computation of the quantized kinetic energy operator, it does not sat¬ 
isfy the desired commutation relations with the quantizations of functions 
whose flow preserves the vertical polarization. Nevertheless, this approach 
to quantizing the kinetic energy may simply be the best one can do. 


23.7 Quantization with Half-Forms: The 
Complex Case 

In the case of a purely complex polarization, half-forms are not
"necessary," in that we typically have a nonzero Hilbert space even without
them. Nevertheless, their inclusion gives advantages. In the first place,
using half-forms makes the complex case more parallel to the real case. In
the second place, complex quantization with half-forms simply gives better
results than without half-forms. In the case of the harmonic oscillator, for
example, the inclusion of half-forms allows (Example 23.53) geometric
quantization to reproduce precisely the spectrum $(n + 1/2)\hbar\omega$,
$n = 0, 1, 2, \ldots$, that we found in the traditional treatment. This result
should be compared to Proposition 22.14 without half-forms, where the
spectrum is found to be $n\hbar\omega$.

Throughout this section, we assume that $(N, \omega)$ is a $2n$-dimensional
quantizable symplectic manifold, that $(L, \nabla)$ is a prequantum line bundle
over $N$, and that $P$ is a Kähler polarization on $N$ (Definition 23.19).
Since the definitions in the complex case are very similar to those in the
real case (with a few important differences), we will run through them
quickly. Since $P$ is no longer equal to $\overline{P}$, we need to replace $P$
by $\overline{P}$ in many of the formulas from Sect. 23.6.

The canonical bundle $\mathcal{K}_P$ of $P$ is the complex line bundle for which
the sections are $n$-forms $\alpha$ satisfying

\[ X \lrcorner \alpha = 0 \]

for each vector field $X$ lying in $\overline{P}$. Sections of $\mathcal{K}_P$ are
precisely the $(n, 0)$-forms on $N$. A section of $\mathcal{K}_P$ is said to be
polarized if

\[ X \lrcorner (d\alpha) = 0 \tag{23.40} \]

for every vector field $X$ lying in $\overline{P}$, or, equivalently, if
$\bar\partial\alpha = 0$. Polarized sections of $\mathcal{K}_P$ are precisely the holomorphic
$(n, 0)$-forms on $N$. By a square root of $\mathcal{K}_P$ we will mean a complex
line bundle $\delta_P$ over $N$ such that $\delta_P \otimes \delta_P$ is isomorphic with
$\mathcal{K}_P$, together with a particular isomorphism of $\delta_P \otimes \delta_P$ with
$\mathcal{K}_P$. Thus, if $s_1$ and $s_2$ are sections of $\delta_P$, we think of
$s_1 \otimes s_2$ as being a section of $\mathcal{K}_P$. We assume that such a square
root exists and we fix for the remainder of this section one particular
square root $\delta_P$.

If $X$ is a vector field that preserves $P$, in the sense of Definition
23.22, then $\mathcal{L}_X$ preserves the space of sections of $\mathcal{K}_P$ and also
the space of polarized sections of $\mathcal{K}_P$. The condition (23.40) defining
polarized sections of $\mathcal{K}_P$ can be understood as the vanishing of a
partial connection $\nabla$, defined for vector fields lying in $\overline{P}$,
and given by $\nabla_X \alpha = X \lrcorner (d\alpha)$. Both the partial connection (for
vector fields lying in $\overline{P}$) and the Lie derivative (for vector
fields preserving $P$) descend from $\mathcal{K}_P$ to $\delta_P$, as in Proposition
23.41 in the real case. The connection on $L$ and the partial connection on
$\delta_P$ combine to give a partial connection on $L \otimes \delta_P$. A section $s$
of $L \otimes \delta_P$ is said to be polarized if $\nabla_X s = 0$ for all vector fields
$X$ lying in $\overline{P}$.

Notation 23.49 If $\beta$ is any $2n$-form on $N$, let the expression

\[ \frac{\beta}{\lambda} \]

denote the unique function on $N$ such that $\beta = (\beta/\lambda)\lambda$, where $\lambda$ is
the Liouville form in Definition 21.16.

Unlike the canonical bundle in the real case, the canonical bundle in the 
purely complex case carries a natural Hermitian structure. 

Proposition 23.50 If $\alpha$ is an $(n, 0)$-form on $N$, then at each point the
$2n$-form

\[ (-1)^{n(n-1)/2} (-i)^n \, \bar\alpha \wedge \alpha \]

is a non-negative multiple of the Liouville form $\lambda$. There is then a unique
Hermitian structure on $\delta_P$ with the property that for each section $s$ of
$\delta_P$ we have

\[ |s|^2 = \left[ \frac{(-1)^{n(n-1)/2} (-i)^n \, \overline{(s \otimes s)} \wedge (s \otimes s)}{2^n \lambda} \right]^{1/2}. \tag{23.41} \]

The factor of $2^n$ in the denominator in (23.41) is inserted for
convenience, to make certain formulas come out more nicely.

Proof. See Exercise 17. ■ 
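
For orientation, the case $n = 1$ can be worked out by hand. The following sketch is an illustration, not the proof requested in Exercise 17. It assumes the complex coordinate $z = x - ip/(m\omega_0)$ on $\mathbb{R}^2$ used for the harmonic oscillator later in this section (with the frequency written here as $\omega_0$ to keep it apart from the symplectic form), assumes that the Liouville form for $n = 1$ is $\lambda = dp \wedge dx$, and reads (23.41) for $n = 1$ as $|s|^2 = [(-i)\,\overline{(s \otimes s)} \wedge (s \otimes s)/(2\lambda)]^{1/2}$.

```latex
\begin{align*}
-i\, d\bar z \wedge dz
  &= -i \left( dx + \frac{i}{m\omega_0}\, dp \right) \wedge \left( dx - \frac{i}{m\omega_0}\, dp \right) \\
  &= -i \left( -\frac{i}{m\omega_0}\, dx \wedge dp + \frac{i}{m\omega_0}\, dp \wedge dx \right)
   = \frac{2}{m\omega_0}\, dp \wedge dx , \\
\bigl| \sqrt{dz} \bigr|^2
  &= \left[ \frac{1}{2} \cdot \frac{2}{m\omega_0} \right]^{1/2}
   = (m\omega_0)^{-1/2} .
\end{align*}
```

Under these assumptions the 2-form is a positive multiple of the Liouville form, and the pointwise norm of the half-form $\sqrt{dz}$ is constant.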

Since, by assumption, there is a Hermitian structure on $L$, the above
Hermitian structure on $\delta_P$ gives rise in a natural way to a Hermitian
structure on $L \otimes \delta_P$.

Definition 23.51 The half-form Hilbert space for a Kähler polarization $P$
on $N$ is the space of square-integrable polarized sections of $L \otimes \delta_P$.

In the $\mathbb{C}^n$ case, using the canonical 1-form as our symplectic potential,
elements of the half-form Hilbert space take the form

\[ e^{-|\mathrm{Im}\, z|^2/(2\hbar)} F(z) \otimes \sqrt{dz_1 \wedge \cdots \wedge dz_n}. \]

In this special case, the norm of the half-form factor
$\sqrt{dz_1 \wedge \cdots \wedge dz_n}$ is constant and the half-form Hilbert space is
still identifiable with the space in Conclusion 22.10. In the case of the
unit disk, on the other hand, the presence of half-forms alters the inner
product; see Exercise 16.

We now define quantized observables on the half-form Hilbert space, 
using the same formula as in the real case. 

Definition 23.52 If $f$ is a function on $N$ for which $X_f$ preserves $P$, let
$Q(f)$ be the operator on the half-form Hilbert space of $P$ given by

\[ Q(f)s = (Q_{\mathrm{pre}}(f)\mu) \otimes \nu - i\hbar\, \mu \otimes \mathcal{L}_{X_f} \nu, \]

where $s$ is decomposed locally as $s = \mu \otimes \nu$, with $\mu$ being a section of
$L$ and $\nu$ a section of $\delta_P$.

These operators satisfy $[Q(f), Q(g)]/(i\hbar) = Q(\{f, g\})$ on the space of
smooth polarized sections of $L \otimes \delta_P$, with the proof of this result
being identical to the proof of Theorem 23.46 in the real case. If $f$ is
real-valued and $X_f$ preserves $P$, then $Q(f)$ will be at least symmetric,
assuming we can find a dense subspace of the half-form Hilbert space
consisting of "nice" functions. (Finding dense subspaces is more difficult
in the holomorphic case than in the real case.) A proof of this claim is
sketched in Exercise 18.

Example 23.53 Consider $\mathbb{R}^2 = T^*\mathbb{R}$ with the Kähler polarization $P$
given by the global complex coordinate $z = x - ip/(m\omega)$, for some positive
number $\omega$. Take $\delta_P$ to be trivial with trivializing section $\sqrt{dz}$.
Consider also the harmonic oscillator Hamiltonian
$H := (p^2 + (m\omega x)^2)/(2m)$. Then $X_H$ preserves $P$ and the operator $Q(H)$
on the half-form Hilbert space has spectrum consisting of numbers of the
form $(n + 1/2)\hbar\omega$, where $n = 0, 1, 2, \ldots$.

In this example, $\omega$ is the frequency of the oscillator and not the canonical
2-form.

Proof. The calculation is the same as in the proof of Proposition 22.14,
except for the addition of the Lie derivative term. A simple calculation
shows that $\mathcal{L}_{X_H}(dz) = i\omega\, dz$, from which it follows that
$\mathcal{L}_{X_H}\sqrt{dz} = (i\omega/2)\sqrt{dz}$. It is then easy to see that the set
of elements of the form $e^{-m\omega|\mathrm{Im}\, z|^2/(2\hbar)} z^n \otimes \sqrt{dz}$ form an
orthonormal basis of eigenvectors for $Q(H)$, with eigenvalues
$(n + 1/2)\hbar\omega$. ■
23.8 Pairing Maps

Pairing maps are designed to allow us to compare the results of quantizing
with respect to two different polarizations. We consider mainly the case
of two "transverse" real polarizations; the case of two complex
polarizations or one real and one complex polarization can be treated with
minor modifications.

Suppose that $P$ and $P'$ are two purely real polarizations and that the
associated leaf spaces $\Xi_1$ and $\Xi_2$ are oriented manifolds. Suppose also
that $P$ and $P'$ are transverse at each point $z \in N$, meaning that
$P_z \cap P'_z = \{0\}$. If $\alpha$ and $\beta$ are polarized sections of $\mathcal{K}_P$ and
$\mathcal{K}_{P'}$, respectively, the transversality assumption is easily shown to
imply that $\alpha \wedge \beta$ is a nowhere-vanishing $2n$-form on $N$. Thus, for any
point $z \in N$, we can define a bilinear "pairing"
$\delta_{P,z} \times \delta_{P',z} \to \mathbb{R}$ by



(23.42) 


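(Recall Notation 23.49.) We can extend this pairing to a pairing
$\delta_{P,z}^{\mathbb{C}} \times \delta_{P',z}^{\mathbb{C}} \to \mathbb{C}$ that is conjugate linear in
the first factor and linear in the second factor. Finally, we extend to a
pairing of
$(L_z \otimes \delta_{P,z}^{\mathbb{C}}) \times (L_z \otimes \delta_{P',z}^{\mathbb{C}}) \to \mathbb{C}$ by
setting $(\mu_1 \otimes \nu_1, \mu_2 \otimes \nu_2)$ equal to $(\mu_1, \mu_2)(\nu_1, \nu_2)$, where
$(\mu_1, \mu_2)$ is computed with respect to the Hermitian structure on $L$.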

Let $\mathbf{H}_1$ and $\mathbf{H}_2$ denote the half-form Hilbert spaces for $P$ and
$P'$, respectively. Given $s_1 \in \mathbf{H}_1$ and $s_2 \in \mathbf{H}_2$, we define the
pairing of $s_1$ and $s_2$ by

\[ \langle s_1, s_2 \rangle_{P,P'} = c \int_N (s_1, s_2)\, \lambda, \]

provided that the integral is absolutely convergent. Here $(s_1, s_2)$ is the
pointwise pairing of $s_1$ and $s_2$ defined in the previous paragraph and $c$
is a certain "universal" constant, depending only on $\hbar$ and the dimension
of $N$, that can be chosen to make certain examples work out nicely. We now
look for a pairing map $\Lambda_{P,P'} : \mathbf{H}_1 \to \mathbf{H}_2$ with the property that

\[ \langle s_1, s_2 \rangle_{P,P'} = \langle \Lambda_{P,P'} s_1, s_2 \rangle_{\mathbf{H}_2}. \tag{23.43} \]

If the pairing is bounded (i.e., it satisfies
$|\langle s_1, s_2 \rangle_{P,P'}| \le C \|s_1\| \|s_2\|$ for some constant $C$), there is
a unique bounded operator $\Lambda_{P,P'}$ satisfying (23.43). Even if the pairing
is unbounded, we may be able to define $\Lambda_{P,P'}$ as an unbounded operator.

If we were optimistic, we might hope that the pairing map for any two
transverse polarizations would be unitary, or at least a constant multiple
of a unitary map. If this were the case, it would suggest that quantization
is independent of the choice of polarization, in the sense that there would
be a natural unitary map between the Hilbert spaces for two different
polarizations. As it turns out, however, the typical pairing map is not a
constant multiple of a unitary map. Nevertheless, there are certain special
cases where the pairing map is unitary (up to a constant), including the
case of translation-invariant polarizations on $\mathbb{R}^{2n}$. See also [20] for
an example of a pairing map between a real and a complex polarization that
is a constant multiple of a unitary map.

We compute just one very special case of the pairing map between two 
real polarizations. 

Example 23.54 Consider $N = \mathbb{R}^2 = T^*\mathbb{R}$ and take $L$ to be trivial with
connection 1-form $\theta = p\,dx$. Let $P$ be the vertical polarization, spanned
at each point by $\partial/\partial p$, and let $P'$ be the horizontal polarization,
spanned at each point by $\partial/\partial x$. Then elements $s_1$ of the half-form space
for $P$ have the form

\[ s_1(x, p) = \phi(x) \otimes \sqrt{dx} \tag{23.44} \]

and elements $s_2$ of the half-form space for $P'$ have the form

\[ s_2(x, p) = \psi(p)\, e^{ixp/\hbar} \otimes \sqrt{dp}, \tag{23.45} \]

where $\phi$ and $\psi$ are functions on $\mathbb{R}$. If $c = 1$, the pairing is computed as

\[ \langle s_1, s_2 \rangle_{P,P'} = -\int_{\mathbb{R}^2} \overline{\phi(x)}\, \psi(p)\, e^{ixp/\hbar} \, dx \, dp. \tag{23.46} \]

If $s_1$ has the form (23.44), then $\Lambda_{P,P'}(s_1)$ has the form (23.45), where

\[ \psi(p) = -\int_{\mathbb{R}} \phi(x)\, e^{-ixp/\hbar} \, dx. \]

Thus, $\Lambda_{P,P'}$ is a scaled version of the Fourier transform and is, in
particular, a constant multiple of a unitary map.

The pairing should be defined initially on some dense subspace of the
Hilbert spaces, such as the subspaces where $\phi$ and $\psi$ are Schwartz
functions. The pairing map can also be defined initially on the Schwartz
space, recognized as being unitary (up to a constant), and then extended by
continuity to all of $\mathbf{H}_1$. Once the pairing map is extended to
$\mathbf{H}_1$, the pairing itself can be defined for all $s_1 \in \mathbf{H}_1$ and
$s_2 \in \mathbf{H}_2$ by taking (23.43) as the definition of
$\langle s_1, s_2 \rangle_{P,P'}$. Even though it is possible, as just described, to
extend the pairing to all of $\mathbf{H}_1 \times \mathbf{H}_2$, the integral in (23.46) is
not always absolutely convergent.

Proof. The forms (23.44) and (23.45) are obtained by a simple modification
of the argument in the proof of Proposition 22.8. We can compute that the
pointwise pairing of $\sqrt{dx}$ and $\sqrt{dp}$ is $-1$, which gives the indicated
form of the pairing in (23.46). The pairing may be rewritten as

\[ \langle s_1, s_2 \rangle_{P,P'} = \int_{\mathbb{R}} \overline{\left( -\int_{\mathbb{R}} \phi(x)\, e^{-ixp/\hbar} \, dx \right)}\, \psi(p) \, dp, \]

which gives the indicated form of the pairing map. ■
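
The statement that the pairing map in Example 23.54 is a constant multiple of a unitary map can be illustrated numerically. The sketch below is an assumption-laden illustration rather than part of the proof: it fixes an arbitrary value of ħ, takes a Gaussian for the function φ, approximates the integral defining ψ(p) = −∫ φ(x) e^{−ixp/ħ} dx by a trapezoid rule, and compares the squared norms. By the Plancherel theorem the ratio of squared norms should be 2πħ. All function names are hypothetical.

```python
import cmath
import math

HBAR = 0.7  # assumption: an arbitrary positive value chosen for illustration


def phi(x):
    """Gaussian test function (hypothetical choice)."""
    return math.exp(-0.5 * x * x)


def trapezoid(f, lo, hi, n):
    """Trapezoid approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, n):
        total += f(lo + k * h)
    return total * h


def pairing_map(u, p, lo=-10.0, hi=10.0, n=800):
    """psi(p) = - integral of u(x) exp(-i x p / hbar) dx, as in Example 23.54."""
    return -trapezoid(lambda x: u(x) * cmath.exp(-1j * x * p / HBAR), lo, hi, n)


# Squared L^2 norms of phi (in x) and of its image psi (in p).
norm_phi_sq = trapezoid(lambda x: abs(phi(x)) ** 2, -10.0, 10.0, 800)
norm_psi_sq = trapezoid(lambda p: abs(pairing_map(phi, p)) ** 2, -8.0, 8.0, 400)
ratio = norm_psi_sq / norm_phi_sq

if __name__ == "__main__":
    print(ratio, 2.0 * math.pi * HBAR)
```

A single test function does not prove unitarity, but repeating the computation with other rapidly decaying functions gives the same ratio, which is what one expects from a scaled Fourier transform.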
23.9 Exercises

1. Let $L$ be a line bundle with connection $\nabla$ over $N$. Let $s$ be a section
of $L$ and let $X_1$ and $X_2$ be two vector fields on $N$ such that
$X_1(z) = X_2(z)$ for some fixed point $z \in N$. Show that

\[ \nabla_{X_1}(s)(z) = \nabla_{X_2}(s)(z). \]

Hint: Use the assumption that $\nabla_{fX} = f\nabla_X$.

2. Let $L$ be a Hermitian line bundle with Hermitian connection $\nabla$ and
let $s_0$ be a locally defined section of $L$ such that $(s_0, s_0) = 1$. Given a
vector field $X$, let $\theta(X)$ be the unique function such that

\[ \nabla_X s_0 = -i\theta(X) s_0. \]

Show that $\theta(X)$ is real valued.

Hint: Use the Hermitian property of the connection.

3. Consider the definition of the curvature 2-form $\omega(X, Y)$ in
Definition 23.4.

(a) Show that the expression for $\omega$ is $C^\infty$-linear in each of the
variables $X$, $Y$, and $s$. That is to say, show that for all smooth
functions $f$, we have $\omega(fX, Y)s = f\omega(X, Y)s$, and similarly for
the variables $Y$ and $s$.

(b) Show that the value of $\omega(X, Y)s$ at a point $z$ depends only on
the values of $X$, $Y$, and $s$ at the point $z$.

(c) Show that the value of $\omega(X, Y)$ at a point $z$ does not depend on
the value of $s$ at $z$, provided that $s(z) \neq 0$.

4. Consider the symplectic form $\omega = dp \wedge dx$ on $\mathbb{R}^2$. Define a purely
complex polarization on $\mathbb{R}^2$ by taking $P_z$ to be the span of the vector
$\partial/\partial z$ in (22.9), for some fixed $\alpha > 0$. Show that $P$ is a Kähler
polarization.

5. Let $P$ be the polarization on $\mathbb{R}^2$ in Exercise 4. Show that the function
$\kappa(x, p) := \alpha p^2$ is a Kähler potential for $P$.

6. Suppose that $\omega$ is a $J$-invariant 2-form on a complex manifold $N$. Show
that $\omega$ is a $(1, 1)$-form. (Recall the definitions preceding Lemma 23.34.)

Hint: Write $\omega = \omega_1 + \omega_2$, where $\omega_1$ is a $(1, 1)$-form and $\omega_2$ is a sum
of a $(2, 0)$-form and a $(0, 2)$-form. Show that

\[ \omega_2(JX, JY) = -\omega_2(X, Y) \]

for all tangent vectors $X$ and $Y$.
7. Suppose that $\kappa$ is a smooth, real-valued function on a complex manifold
$N$. Show that the 2-form $i\partial\bar\partial\kappa$ is a real-valued 2-form.

8. In Example 23.30, verify that 8 is a symplectic potential for w, and 
compute 8(d/dz ), where, with z = x — iy, we have d/dz = ( d/dx — 
id/dy)/ 2. Then verify that so(*0 := (1 — l^ 2 ) 1 ^ satisfies Va/g 2 so = 0 
and thus constitutes a global trivializing holomorphic section. 

9. Consider the situation in Example 23.29. Show that the canonical bundle
for $P$ is trivial, with trivializing section $dx$. Let $\delta_P$ be the
(nontrivial) bundle described in the paragraph preceding Definition 23.42.
Since the tensor product of any real line bundle with itself is trivial,
$\delta_P \otimes \delta_P$ is isomorphic to $\mathcal{K}_P$. Let $\sqrt{dx}$ denote a
discontinuous section defined over the set $0 < \phi < 2\pi$ such that
$\sqrt{dx} \otimes \sqrt{dx} = dx$. Show that $\nabla_X(dx) = 0$ and
$\nabla_X \sqrt{dx} = 0$ for every vector field $X$ lying in $P$. Now show that the
Bohr-Sommerfeld leaves (in the half-form sense, for this choice of $\delta_P$)
are the sets of the form $\{x\} \times S^1$, where $x/\hbar = n + 1/2$ for some
integer $n$.

10. Let b be a smooth, real-valued function on $\mathbb{R}$ and let c be a real
constant. Show that an operator of the form

$$\psi \mapsto -i\hbar\,\bigl(b(x)\psi'(x) + c\,b'(x)\psi(x)\bigr)$$

is symmetric on $C_c^\infty(\mathbb{R}) \subset L^2(\mathbb{R})$ if and only if c = 1/2.

11. Let P be a real polarization and let f be a smooth polarized function
on N, that is, one for which derivatives in the direction of P are
zero. Show that Q(f) acts on the half-form Hilbert space simply as
multiplication by f. (Compare Proposition 23.25 in the case without
half-forms.)

Hint: Show that $\mathcal{L}_{X_f}\alpha = 0$ whenever $\alpha$ is a polarized section of $\mathcal{K}_P$.

12. Using the identities $\mathcal{L}_{[X,Y]} = [\mathcal{L}_X, \mathcal{L}_Y]$ and $X_{\{f,g\}} = [X_f, X_g]$, verify
the identity (23.36).

13. Prove that if P is a real polarization on N, it is possible to choose a
symplectic potential $\theta$ locally in such a way that $\theta$ is zero on P.

Hint: Use functions / x as in the proof of Proposition 23.26. 

14. Suppose that P is a purely real polarization on N and $\theta$ is a local
symplectic potential that vanishes on P. Suppose also that f is a
real-valued function for which $X_f$ preserves P. Show that the function
$-\theta(X_f) - f$ is constant along the leaves of P.

Hint: If X is a vector field lying in P, use (21.6) to show that
$X(\theta(X_f)) = d\theta(X, X_f)$.
15. Suppose that $\beta$ is a nowhere vanishing n-form on an oriented manifold
S, that X is a real vector field on S, and that $\phi$ and $\psi$ are smooth,
compactly supported functions on S. Verify the following formula for
"integration by parts":

$$\int_S (X\phi)\,\psi\,\beta = -\int_S \phi\,(X\psi)\,\beta - \int_S \phi\,\psi\,(\mathrm{div}_\beta X)\,\beta,$$

where $\mathrm{div}_\beta X$ is the function such that $\mathcal{L}_X\beta = (\mathrm{div}_\beta X)\beta$.

Hint: If $\Phi_t$ is the flow generated by X, then for all sufficiently small
t, $\Phi_t(x)$ is defined for all x in the support of $\phi\psi$ and the integral of
$(\Phi_t)^*(\phi\psi\beta)$ over S is independent of t.

16. Let the notation be as in Exercise 8. Then the canonical bundle for
P is trivial, with trivializing section dz. Take $\delta_P$ to be trivial, with
trivializing section $\sqrt{dz}$. Show that every polarized section s of $L \otimes \delta_P$
is of the form

$$s = F(z)\, s_0(z) \otimes \sqrt{dz},$$

where F is holomorphic. Show that the norm of such a section is, up
to a constant, the $L^2$ norm of F with respect to a measure of the form
$(1 - |z|^2)^\nu$, but that the value of $\nu$ is not the same as when half-forms
are not included.

17. Let P be a Kähler polarization on N, let $z_1, \ldots, z_n$ be holomorphic
local coordinates on N, and let A be the matrix given by

$$A_{jk} = \omega\!\left(\frac{\partial}{\partial z_j}, \frac{\partial}{\partial \bar{z}_k}\right).$$

(a) Show that the matrix iA is positive definite.

(b) Show that $\omega = \sum_{j,k} A_{jk}\, dz_j \wedge d\bar{z}_k$.

(c) Show that the quantity $\omega^{\wedge n}/n!$ may be computed as

$$\det(iA)\,(-1)^{n(n-1)/2}\,(-i)^n\, dz_1 \wedge \cdots \wedge dz_n \wedge d\bar{z}_1 \wedge \cdots \wedge d\bar{z}_n.$$

(d) Verify Proposition 23.50.

18. Let P be a Kähler polarization on N, let $\delta_P$ be a fixed square root of
$\mathcal{K}_P$, and let f be a smooth, real-valued function such that $X_f$ preserves
P. Throughout this problem, if $s_1$ and $s_2$ are local sections of a line
bundle, with $s_2$ nonvanishing, $s_1/s_2$ will denote the unique function
such that $s_1 = (s_1/s_2)\,s_2$.

(a) Show that for any continuous compactly supported function $\psi$
on N, we have

Hint: Use Liouville's theorem.

Note: The same result holds if $\psi$ is not compactly supported but
is "sufficiently nice."

(b) If $\nu$ is a local nonvanishing section of $\delta_P$, show that

$$\frac{\mathcal{L}_{X_f}\nu}{\nu} = \frac{1}{2}\,\frac{\mathcal{L}_{X_f}(\nu \otimes \nu)}{\nu \otimes \nu}.$$


(c) If a is any 2n-form on N, show that 



(d) Suppose $s_1$ and $s_2$ are polarized sections of $L \otimes \delta_P$, decomposed
locally as $s_j = \mu_j \otimes \nu_j$, j = 1, 2. Show that

iXf(si,s 2 ) = (i(V Xf n i) 0 i/i, s 2 ) + (*> l 0 {C Xf vi) 0s 2 ) 


+ (si,*(VaVU) ® v 2 ) + (Sl,Ui 2 ® (£ Xf v 2 )) 


where $(\cdot, \cdot)$ is computed with respect to the Hermitian structure
on $L \otimes \delta_P$ described in Sect. 23.7.

Hint: Use the identity $\mathcal{L}_{X_f}(\alpha \wedge \beta) = (\mathcal{L}_{X_f}\alpha) \wedge \beta + \alpha \wedge (\mathcal{L}_{X_f}\beta)$.

(e) Suppose $s_1$ and $s_2$ are polarized sections of $L \otimes \delta_P$ belonging to
the domain of Q(f) and such that $(s_1, s_2)$ is "sufficiently nice."
Show that

$$\langle s_1, Q(f)s_2\rangle = \langle Q(f)s_1, s_2\rangle.$$
A.1 Tensor Products of Vector Spaces

Given two vector spaces $V_1$ and $V_2$ over $\mathbb{C}$, the tensor product is a new vector
space $V_1 \otimes V_2$, together with a bilinear "product" map $\otimes : V_1 \times V_2 \to V_1 \otimes V_2$.
If $V_1$ and $V_2$ are finite dimensional with bases $\{u_j\}$ and $\{v_k\}$, then $V_1 \otimes V_2$
is finite dimensional with $\{u_j \otimes v_k\}$ forming a basis for $V_1 \otimes V_2$. In the
finite-dimensional case, we could simply define the tensor product by this basis
property, but then we would have to worry about whether the construction
is basis independent. Instead, we define $V_1 \otimes V_2$ by a "universal property."

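Definition A.1 Suppose $V_1$ and $V_2$ are vector spaces over a field $\mathbb{F}$. Then
a tensor product of $V_1$ and $V_2$ is a vector space W over $\mathbb{F}$ together with
a bilinear map $T : V_1 \times V_2 \to W$ having the following "universal property":
If U is any vector space over $\mathbb{F}$ and $\Phi : V_1 \times V_2 \to U$ is a bilinear map,
then there exists a unique linear map $\tilde{\Phi} : W \to U$ such that the following
diagram commutes:

$$\begin{array}{ccc}
V_1 \times V_2 & \xrightarrow{\;T\;} & W \\
{\scriptstyle \Phi}\,\downarrow & \swarrow\,{\scriptstyle \tilde{\Phi}} & \\
U & &
\end{array}$$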

Proposition A.2 For any two vector spaces $V_1$ and $V_2$, a tensor product
of $V_1$ and $V_2$ exists and is unique up to "canonical isomorphism." That is,
for two tensor products $(W_1, T_1)$ and $(W_2, T_2)$, there is a unique invertible
linear map $\Psi : W_1 \to W_2$ such that $T_2 = \Psi \circ T_1$.

In light of the uniqueness result, we may speak of "the" tensor product of
$V_1$ and $V_2$. We choose any one tensor product and we denote it by $V_1 \otimes V_2$.
We also denote the bilinear map $T : V_1 \times V_2 \to V_1 \otimes V_2$ as $(u, v) \mapsto u \otimes v$. In
this notation, the universal property reads as follows: Given any bilinear
map $\Phi$ of $V_1 \times V_2$ into a vector space U, there exists a unique linear map
$\tilde{\Phi} : V_1 \otimes V_2 \to U$ such that

$$\tilde{\Phi}(u \otimes v) = \Phi(u, v).$$

Proposition A.3 If $V_1$ and $V_2$ are finite-dimensional vector spaces with
bases $\{u_j\}_{j=1}^{n_1}$ and $\{v_k\}_{k=1}^{n_2}$, then $V_1 \otimes V_2$ is finite dimensional and the set
of elements of the form $u_j \otimes v_k$, $1 \le j \le n_1$, $1 \le k \le n_2$, forms a basis for
$V_1 \otimes V_2$. In particular,

$$\dim(V_1 \otimes V_2) = (\dim V_1)(\dim V_2).$$

It should be emphasized that, in general, not every element of $V_1 \otimes V_2$
is of the form $u \otimes v$ with $u \in V_1$ and $v \in V_2$. All we can say is that each
element of $V_1 \otimes V_2$ can be decomposed as a linear combination of elements
of the form $u \otimes v$. This decomposition, furthermore, is far from canonical;
even in the finite-dimensional case, it depends on a choice of bases for $V_1$
and $V_2$. Nevertheless, the universal property of the tensor product tells us
that we can define linear maps from $V_1 \otimes V_2$ to any vector space U, simply
by defining them on elements of the form $u \otimes v$. Provided that $\Phi(u, v)$ is
bilinear in u and v, the universal property tells us that there is a unique
linear map $\tilde{\Phi}$ on $V_1 \otimes V_2$ such that on elements of the form $u \otimes v$, $\tilde{\Phi}$ is equal
to $\Phi(u, v)$. A representative application of the universal property is in the
following result.

Proposition A.4 If $A \in \mathrm{End}(V_1)$ and $B \in \mathrm{End}(V_2)$, there exists a unique
linear map $A \otimes B : V_1 \otimes V_2 \to V_1 \otimes V_2$ such that

$$(A \otimes B)(u \otimes v) = (Au) \otimes (Bv).$$

For $A_1, A_2 \in \mathrm{End}(V_1)$ and $B_1, B_2 \in \mathrm{End}(V_2)$, we have

$$(A_1 \otimes B_1)(A_2 \otimes B_2) = (A_1 A_2) \otimes (B_1 B_2).$$

To construct $A \otimes B$, we apply the universal property with $U = V_1 \otimes V_2$
and $\Phi(u, v) = (Au) \otimes (Bv)$. Since A and B are linear and $\otimes$ is bilinear, $\Phi$
is bilinear. The linear map $\tilde{\Phi} : V_1 \otimes V_2 \to V_1 \otimes V_2$ is then the map that we
denote $A \otimes B$.
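
As an illustration of Proposition A.4 in the finite-dimensional case, the following sketch realizes $u \otimes v$ and $A \otimes B$ in coordinates, using the basis $u_j \otimes v_k$ ordered with j varying slowest. Under that assumed ordering, $A \otimes B$ becomes the Kronecker product of the two matrices. The helper names (`kron`, `tensor`, `matvec`, `matmul`) and the sample matrices are chosen for this example only and do not come from the text.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def tensor(u, v):
    """Coordinates of u (x) v in the basis u_j (x) v_k, with j varying slowest."""
    return [x * y for x in u for y in v]

def matvec(M, x):
    """Matrix-vector product."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def matmul(A, B):
    """Matrix-matrix product."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

A1 = [[1, 2], [3, 4]]
B1 = [[0, 1], [1, 0]]
A2 = [[2, 0], [1, 1]]
B2 = [[1, 1], [0, 2]]
u = [1, -1]
v = [2, 5]

# (A (x) B)(u (x) v) compared with (Au) (x) (Bv)
lhs = matvec(kron(A1, B1), tensor(u, v))
rhs = tensor(matvec(A1, u), matvec(B1, v))

# (A1 (x) B1)(A2 (x) B2) compared with (A1 A2) (x) (B1 B2)
prod_lhs = matmul(kron(A1, B1), kron(A2, B2))
prod_rhs = kron(matmul(A1, A2), matmul(B1, B2))

print(lhs == rhs, prod_lhs == prod_rhs)
```

Integer entries are used so that both identities can be compared exactly rather than up to rounding.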

The tensor product, as we have defined it in this section, applies to 
all vector spaces, whether finite dimensional or infinite dimensional. The 
construction, however, is purely algebraic; if there is a topology on $V_1$ and
$V_2$, the tensor product takes no account of that topology. In the Hilbert
space setting, then, we will have to refine the notion of the tensor product 
so that the tensor product of two Hilbert spaces will again be a Hilbert 
space. See Sect. A.4.5. 
It is assumed that the reader is familiar with the basic notions of measure
theory, including the concepts of $\sigma$-algebras, measures, measurable
functions, and the Lebesgue integral. A triple $(X, \Omega, \mu)$, consisting of a set X, a
$\sigma$-algebra $\Omega$ of subsets of X, and a (non-negative) measure $\mu$ on $\Omega$ is called
a measure space. A measurable function $\psi : X \to \mathbb{C}$ is said to be integrable
if $\int_X |\psi|\, d\mu < \infty$. The $\sigma$-algebra generated by any collection of subsets of a
set X is the smallest $\sigma$-algebra of subsets of X containing that collection.

We assume those parts of measure theory that are entirely standard: the
monotone convergence and dominated convergence theorems, $L^p$ spaces,
and Fubini's theorem. We briefly review a few other topics that might not
be as familiar.

A measure $\mu$ on a measurable space $(X, \Omega)$ is said to be $\sigma$-finite if X can
be written as a countable union of measurable sets of finite measure.

Definition A.5 Suppose $\mu$ and $\nu$ are two $\sigma$-finite measures on a measure
space $(X, \Omega)$. Then we say that $\mu$ is absolutely continuous with respect
to $\nu$ if for all $E \in \Omega$, if $\nu(E) = 0$ then $\mu(E) = 0$. We say that $\mu$ and $\nu$
are equivalent if each measure is absolutely continuous with respect to the
other.

Theorem A.6 (Radon-Nikodym) Suppose $\mu$ and $\nu$ are two $\sigma$-finite
measures on a measure space $(X, \Omega)$ and that $\mu$ is absolutely continuous
with respect to $\nu$. Then there exists a non-negative, measurable function $\rho$
on X such that

$$\mu(E) = \int_E \rho\, d\nu$$

for all $E \in \Omega$. The function $\rho$ is called the density of $\mu$ with respect to $\nu$.

Definition A.7 A collection $\mathcal{M}$ of subsets of a set X is called a
monotone class if $\mathcal{M}$ is closed under countable increasing unions and countable
decreasing intersections.

A countable increasing union means the union of a sequence $E_j$ of sets
where $E_j$ is contained in $E_{j+1}$ for each j, with a similar definition for
countable decreasing intersections.

Theorem A.8 (Monotone Class Lemma) Suppose $\mathcal{M}$ is a monotone
class of subsets of a set X and suppose $\mathcal{M}$ contains an algebra $\mathcal{A}$ of subsets
of X. Then $\mathcal{M}$ contains the $\sigma$-algebra generated by $\mathcal{A}$.

Corollary A.9 Suppose $\mu$ and $\nu$ are two finite measures on a measure
space $(X, \Omega)$. Suppose $\mu$ and $\nu$ agree on an algebra $\mathcal{A} \subset \Omega$. Then $\mu$ and $\nu$
agree on the $\sigma$-algebra generated by $\mathcal{A}$.

Note that in general, the collection of sets on which two measures agree
is not a $\sigma$-algebra, nor even an algebra.
Theorem A.10 Suppose $\mu$ is a measure on the Borel $\sigma$-algebra in a locally
compact, separable metric space X. Suppose also that $\mu(K) < \infty$ for each
compact subset K of X. Then the space of continuous functions of compact
support on X is dense in $L^p(X, \mu)$, for all p with $1 \le p < \infty$.

A word of clarification is in order here. If $\psi$ is a continuous function on
X with compact support, then $\int_X |\psi|^p\, d\mu$ is finite, since $\psi$ is bounded and
$\mu$ is finite on compact sets. Thus, we can define a map from $C_c(X)$ into
$L^p(X, \mu)$ by mapping a continuous function $\psi$ of compact support to the
equivalence class $[\psi]$. The theorem is asserting, more precisely, that the
image of $C_c(X)$ under this map is dense in $L^p(X, \mu)$. It should be noted,
however, that the map $\psi \mapsto [\psi]$ need not be injective. After all, if there
is a nonempty open set U inside X with $\mu(U) = 0$, then for any $\psi$ with
support contained in U, the equivalence class $[\psi]$ will be the zero element of
$L^p(X, \mu)$. Nevertheless, we will allow ourselves a small abuse of terminology
and say that $C_c(X)$ is dense in $L^p(X, \mu)$.


A.3 Elementary Functional Analysis 

In this section, we briefly review some of the results from elementary
functional analysis that we make use of in the text. Most of these results can be
found in the book of Rudin [32].

A.3.1 The Stone-Weierstrass Theorem 

The Weierstrass theorem states that every continuous, real-valued function
on an interval can be uniformly approximated by polynomials. A substantial
generalization of this was obtained by Stone. If X is a compact metric
space, let $C(X; \mathbb{R})$ and $C(X; \mathbb{C})$ denote the spaces of continuous real- and
complex-valued functions, respectively. A subset $\mathcal{A}$ of $C(X; \mathbb{F})$
is called an algebra if it is closed under pointwise addition, pointwise
multiplication, and multiplication by elements of $\mathbb{F}$, where $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$. An
algebra $\mathcal{A}$ is said to separate points if for any two distinct points x and y
in X, there exists $f \in \mathcal{A}$ such that $f(x) \neq f(y)$. We use on $C(X; \mathbb{F})$ the
supremum norm, given by

$$\|f\|_{\sup} := \sup_{x \in X} |f(x)|,$$

and $C(X; \mathbb{F})$ is complete with respect to the associated distance function,

$$d(f, g) = \|f - g\|_{\sup}.$$

Theorem A.11 (Stone-Weierstrass, Real Version) Let X be a compact
metric space and let $\mathcal{A}$ be an algebra in $C(X; \mathbb{R})$. If $\mathcal{A}$ contains the
constant functions and separates points, then $\mathcal{A}$ is dense in $C(X; \mathbb{R})$ with
respect to the supremum norm.

Theorem A.12 (Stone-Weierstrass, Complex Version) Let X be a
compact metric space and let $\mathcal{A}$ be an algebra in $C(X; \mathbb{C})$. If $\mathcal{A}$ contains the
constant functions, separates points, and is closed under complex conjugation,
then $\mathcal{A}$ is dense in $C(X; \mathbb{C})$ with respect to the supremum norm.

A consequence of the complex version of the Stone-Weierstrass theorem
is the following: If K is a compact subset of $\mathbb{C}$, then every continuous,
complex-valued function on K can be uniformly approximated by
polynomials in z and $\bar{z}$.


A.3.2 The Fourier Transform 

We now describe the Fourier transform on R n , in various forms. 

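Definition A.13 For any $\psi \in L^1(\mathbb{R}^n)$, define the Fourier transform of
$\psi$ to be the function $\hat{\psi}$ on $\mathbb{R}^n$ given by

$$\hat{\psi}(\mathbf{k}) = (2\pi)^{-n/2} \int_{\mathbb{R}^n} e^{-i\mathbf{k}\cdot\mathbf{x}}\, \psi(\mathbf{x})\, d\mathbf{x}.$$

Proposition A.14 For any $\psi \in L^1(\mathbb{R}^n)$, the Fourier transform $\hat{\psi}$ of $\psi$ has
the following properties: (1) $|\hat{\psi}(\mathbf{k})| \le (2\pi)^{-n/2} \|\psi\|_{L^1}$, (2) $\hat{\psi}$ is continuous,
and (3) $\hat{\psi}(\mathbf{k})$ tends to zero as $|\mathbf{k}|$ tends to $\infty$.

The bound on $\hat{\psi}$ is obvious and the continuity of $\hat{\psi}$ follows from
dominated convergence. To show that $\hat{\psi}$ tends to zero at infinity, we first
establish this on a dense subspace of $L^1(\mathbb{R}^n)$ (e.g., the Schwartz space; see
below) and then take uniform limits.

Definition A.15 The Schwartz space $\mathcal{S}(\mathbb{R}^n)$ is the space of all $C^\infty$
functions $\psi$ on $\mathbb{R}^n$ such that

$$\lim_{|\mathbf{x}| \to \infty} \left| \mathbf{x}^{\mathbf{j}}\, \partial^{\mathbf{k}} \psi(\mathbf{x}) \right| = 0$$

for all n-tuples of non-negative integers $\mathbf{j}$ and $\mathbf{k}$. Here if $\mathbf{j} = (j_1, \ldots, j_n)$
then $\mathbf{x}^{\mathbf{j}} = x_1^{j_1} \cdots x_n^{j_n}$ and

$$\partial^{\mathbf{k}} = \left(\frac{\partial}{\partial x_1}\right)^{k_1} \cdots \left(\frac{\partial}{\partial x_n}\right)^{k_n}.$$

An element of the Schwartz space is called a Schwartz function.

Proposition A.16 If $\psi$ belongs to $\mathcal{S}(\mathbb{R}^n)$, then $\hat{\psi}$ also belongs to $\mathcal{S}(\mathbb{R}^n)$.

The proof of this result hinges on the behavior of the Fourier transform
under differentiation and under multiplication by $x_j$, results which are of
interest in their own right.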
Proposition A.17 If $\psi$ is a Schwartz function, the following properties
hold.

1. We have

$$\widehat{\frac{\partial \psi}{\partial x_j}}(\mathbf{k}) = i k_j\, \hat{\psi}(\mathbf{k}). \qquad (A.1)$$

2. The function $\hat{\psi}$ is differentiable at every point and the Fourier
transform of the function $x_j \psi(\mathbf{x})$ is given by

$$\widehat{x_j \psi}(\mathbf{k}) = i\, \frac{\partial \hat{\psi}}{\partial k_j}(\mathbf{k}). \qquad (A.2)$$

The first point is proved by integration by parts and the second by
differentiation under the integral in the definition of $\hat{\psi}$.

Theorem A.18 (Fourier Inversion and Plancherel Formula, I) The
Fourier transform on $\mathcal{S}(\mathbb{R}^n)$ has the following properties.

1. The Fourier transform maps the Schwartz space onto the Schwartz
space.

2. For all $\psi \in \mathcal{S}(\mathbb{R}^n)$, the function $\psi$ can be recovered from its Fourier
transform by the Fourier inversion formula:

$$\psi(\mathbf{x}) = (2\pi)^{-n/2} \int_{\mathbb{R}^n} e^{i\mathbf{k}\cdot\mathbf{x}}\, \hat{\psi}(\mathbf{k})\, d\mathbf{k}.$$

3. For all $\psi \in \mathcal{S}(\mathbb{R}^n)$, we have the Plancherel theorem:

$$\int_{\mathbb{R}^n} |\psi(\mathbf{x})|^2\, d\mathbf{x} = \int_{\mathbb{R}^n} |\hat{\psi}(\mathbf{k})|^2\, d\mathbf{k}.$$

Since the Schwartz space is dense in $L^2(\mathbb{R}^n)$, the BLT theorem and
Theorem A.18 imply that the Fourier transform extends uniquely to an isometric
map of $L^2(\mathbb{R}^n)$ onto $L^2(\mathbb{R}^n)$.

Theorem A.19 (Fourier Inversion and Plancherel Theorem, II)
The Fourier transform extends to an isometric map $\mathcal{F}$ of $L^2(\mathbb{R}^n)$ onto
$L^2(\mathbb{R}^n)$. This map may be computed as

$$\mathcal{F}(\psi)(\mathbf{k}) = (2\pi)^{-n/2} \lim_{A \to \infty} \int_{|\mathbf{x}| \le A} e^{-i\mathbf{k}\cdot\mathbf{x}}\, \psi(\mathbf{x})\, d\mathbf{x}, \qquad (A.3)$$

where the limit is in the norm topology of $L^2(\mathbb{R}^n)$. The inverse map $\mathcal{F}^{-1}$
may be computed as

$$(\mathcal{F}^{-1} f)(\mathbf{x}) = (2\pi)^{-n/2} \lim_{A \to \infty} \int_{|\mathbf{k}| \le A} e^{i\mathbf{k}\cdot\mathbf{x}}\, f(\mathbf{k})\, d\mathbf{k}.$$

If $\psi$ belongs to $L^1(\mathbb{R}^n) \cap L^2(\mathbb{R}^n)$, then by dominated convergence, the
limit in (A.3) coincides with the $L^1$ Fourier transform in Definition A.13.

Definition A.20 For two measurable functions $\phi$ and $\psi$, define the
convolution $\phi * \psi$ of $\phi$ and $\psi$ by the formula

$$(\phi * \psi)(\mathbf{x}) = \int_{\mathbb{R}^n} \phi(\mathbf{x} - \mathbf{y})\, \psi(\mathbf{y})\, d\mathbf{y},$$

provided that the integral is absolutely convergent for all $\mathbf{x}$.

Proposition A.21 Suppose that $\phi$ and $\psi$ belong to $L^1(\mathbb{R}^n) \cap L^2(\mathbb{R}^n)$. Then
$\phi * \psi$ is defined and belongs to $L^1(\mathbb{R}^n) \cap L^2(\mathbb{R}^n)$ and we have

$$(2\pi)^{-n/2}\, \mathcal{F}(\phi * \psi) = \mathcal{F}(\phi)\, \mathcal{F}(\psi).$$

This result is proved by plugging $\phi * \psi$ into the definition of the Fourier
transform, writing $e^{-i\mathbf{k}\cdot\mathbf{x}}$ as $e^{-i\mathbf{k}\cdot\mathbf{y}}\, e^{-i\mathbf{k}\cdot(\mathbf{x} - \mathbf{y})}$, and using Fubini's theorem.
We will have occasion to use the following Gaussian integral.

Proposition A.22 For all a > 0 and $b \in \mathbb{C}$, we have

$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-x^2/(2a)}\, e^{bx}\, dx = \sqrt{a}\; e^{ab^2/2}.$$

Taking b = ik in the proposition gives us the Fourier
transform of the Gaussian function $e^{-x^2/(2a)}$. Taking b = 0 allows us to
determine the proper normalization of the Gaussian probability density.
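
The Gaussian integral in Proposition A.22 can be checked numerically. The sketch below approximates the left-hand side by the trapezoid rule on a truncated interval and compares it with the closed form $\sqrt{a}\, e^{ab^2/2}$. The function names, the truncation half-width, and the number of steps are assumptions made for this illustration and are not part of the text.

```python
import cmath
import math

def gaussian_integral(a, b, half_width=40.0, steps=20000):
    """Trapezoid-rule estimate of (1/sqrt(2*pi)) * integral of
    exp(-x^2/(2a)) * exp(b*x) over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0j
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * cmath.exp(-x * x / (2.0 * a) + b * x)
    return total * h / math.sqrt(2.0 * math.pi)

def closed_form(a, b):
    """Right-hand side of Proposition A.22: sqrt(a) * exp(a * b^2 / 2)."""
    return math.sqrt(a) * cmath.exp(a * b * b / 2.0)

# A complex value of b, and the special case b = ik (Fourier transform of a Gaussian).
err_complex = abs(gaussian_integral(2.0, 0.5 + 1.0j) - closed_form(2.0, 0.5 + 1.0j))
err_fourier = abs(gaussian_integral(1.5, 2.0j) - closed_form(1.5, 2.0j))
print(err_complex, err_fourier)
```

The truncation is harmless here because the integrand decays like $e^{-x^2/(2a)}$; for much larger a or Re b the half-width would need to grow accordingly.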


A.3.3 Distributions

In this section we give a brief account of the theory of distributions—what 
physicists call “generalized functions”—including the notion of “derivative 
in the distribution sense.” 

The idea is that we study functions by studying their integral against
some class of very nice "test functions." Consider, for example, a locally
integrable function f and consider integrals of the form

$$\int_{\mathbb{R}^n} \chi(\mathbf{x})\, f(\mathbf{x})\, d\mathbf{x}, \qquad (A.4)$$

where $\chi$ belongs to $C_c^\infty(\mathbb{R}^n)$, the space of smooth, compactly supported
functions. We might think, for example, that $\chi$ is positive, has integral
equal to 1, and is supported near some point $\mathbf{a} \in \mathbb{R}^n$. In that case, the
integral (A.4) is an approximation to the value of f at $\mathbf{a}$, what physicists
describe as a "smeared out" version of $f(\mathbf{a})$.
Proposition A.23 Suppose $f_1$ and $f_2$ are locally integrable functions on
$\mathbb{R}^n$. If

$$\int_{\mathbb{R}^n} \chi(\mathbf{x})\, f_1(\mathbf{x})\, d\mathbf{x} = \int_{\mathbb{R}^n} \chi(\mathbf{x})\, f_2(\mathbf{x})\, d\mathbf{x}$$

for all $\chi \in C_c^\infty(\mathbb{R}^n)$, then $f_1(\mathbf{x}) = f_2(\mathbf{x})$ for almost every $\mathbf{x}$.

The idea now is that we allow objects that do not have values at points,
but for which something like (A.4) makes sense. Mathematically, we think
of (A.4) as a linear functional on $C_c^\infty(\mathbb{R}^n)$.

Definition A.24 A sequence $\chi_m \in C_c^\infty(\mathbb{R}^n)$ is said to converge to
$\chi \in C_c^\infty(\mathbb{R}^n)$ if (1) there exists a single compact set K containing the support
of all the $\chi_m$'s, (2) $\chi_m$ converges uniformly to $\chi$, and (3) each derivative
of $\chi_m$ converges uniformly to the corresponding derivative of $\chi$.

Definition A.25 A distribution on $\mathbb{R}^n$ is a linear map $T : C_c^\infty(\mathbb{R}^n) \to \mathbb{C}$
having the following continuity property: If $\chi_m$ converges to $\chi$ in the sense
of Definition A.24, $T(\chi_m)$ converges to $T(\chi)$.

The continuity condition on T should be regarded as a technicality, in
that any functional that is well defined and linear on all of $C_c^\infty(\mathbb{R}^n)$ and is
obtained in a reasonably constructive fashion will satisfy this property.

Example A.26 The Dirac $\delta$-"function" is the distribution $\delta$ defined by

$$\delta(\chi) = \chi(0).$$

Definition A.27 If T is a distribution and f is a locally integrable
function, the expression "T is equal to f" or "T is given by f" means that

$$T(\chi) = \int_{\mathbb{R}^n} \chi(\mathbf{x})\, f(\mathbf{x})\, d\mathbf{x}$$

for all $\chi \in C_c^\infty(\mathbb{R}^n)$.

Definition A.28 If T is a distribution, define the distribution $\partial T/\partial x_j$ by
the formula

$$\frac{\partial T}{\partial x_j}(\chi) = -T\!\left(\frac{\partial \chi}{\partial x_j}\right).$$

It is easy to verify that if T has the continuity property in Definition
A.25, then so does $\partial T/\partial x_j$. Furthermore, if T is given by a continuously
differentiable function, then the derivative of T in the distribution sense
coincides with the derivative of T in the classical sense, as can easily be
shown using integration by parts. If T is a distribution, we may define $\Delta T$
by repeated applications of Definition A.28, with the result that

$$(\Delta T)(\chi) = T(\Delta \chi).$$
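
A standard illustration of the distributional derivative, not taken from the text above, is the Heaviside step function H (equal to 1 for x > 0 and 0 otherwise) on $\mathbb{R}$: applying the definition gives $(dH/dx)(\chi) = -\int_0^\infty \chi'(x)\,dx = \chi(0)$, so that $dH/dx = \delta$. The sketch below checks this numerically for one particular test function, the usual bump $\chi(x) = e^{-1/(1-x^2)}$ on (-1, 1). The function names and the midpoint-rule discretization are assumptions made for the example.

```python
import math

def bump(x):
    """Smooth test function supported in [-1, 1]."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def bump_prime(x):
    """Classical derivative of bump, computed by the chain rule."""
    if abs(x) >= 1.0:
        return 0.0
    return bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2

def heaviside_derivative(chi_prime, upper=1.0, steps=20000):
    """(dH/dx)(chi) = -T_H(chi') = -(integral of chi' over [0, upper]),
    approximated by the midpoint rule; upper bounds the support of chi."""
    h = upper / steps
    total = sum(chi_prime((i + 0.5) * h) for i in range(steps))
    return -total * h

distributional_value = heaviside_derivative(bump_prime)
delta_value = bump(0.0)  # what the Dirac delta assigns to the same test function
print(distributional_value, delta_value)
```

Both numbers should agree with $e^{-1}$ up to discretization error, which is one instance of the statement that the derivative of the step function is the $\delta$-"function."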
Proposition A.29 If $\phi$ and $\psi$ are $L^2$ functions, the equation $\partial\psi/\partial x_j = \phi$
holds in the distribution sense if and only if

$$-\left\langle \frac{\partial \chi}{\partial x_j}, \psi \right\rangle = \langle \chi, \phi \rangle$$

for all $\chi \in C_c^\infty(\mathbb{R}^n)$. Similarly, the equation $\Delta\psi = \phi$ holds in the
distribution sense if and only if

$$\langle \Delta\chi, \psi \rangle = \langle \chi, \phi \rangle$$

for all $\chi \in C_c^\infty(\mathbb{R}^n)$.

Proposition A.30 If T is a distribution on $\mathbb{R}$ and dT/dx is the zero
distribution, then T is a constant, meaning that there is some constant c such
that

$$T(\chi) = c \int_{-\infty}^{\infty} \chi(x)\, dx. \qquad (A.5)$$

Suppose, in particular, that T is given by a locally integrable function f,
and the derivative of T is zero. Then Proposition A.30 tells us that for some
constant c, we have $\int_{-\infty}^{\infty} \chi(x)\,(f(x) - c)\, dx = 0$ for all $\chi \in C_c^\infty(\mathbb{R})$. Then
Proposition A.23 tells us that f(x) = c almost everywhere. This means that
if the derivative of f is zero, even in the weak (or distributional) sense, then
f must be constant.

A.3.4 Banach Spaces 

In this section, we define Banach spaces and describe some of their elemen¬ 
tary properties. 

Definition A.31 A norm on a vector space V over $\mathbb{F}$ ($\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$) is a
map from V into $\mathbb{R}$, denoted $\psi \mapsto \|\psi\|$, with the following properties.

1. For all $\psi \in V$, $\|\psi\| \ge 0$, with equality if and only if $\psi = 0$.

2. For all $\psi \in V$ and $c \in \mathbb{F}$, we have $\|c\psi\| = |c|\,\|\psi\|$.

3. For all $\phi, \psi \in V$, we have $\|\phi + \psi\| \le \|\phi\| + \|\psi\|$.

If $\|\cdot\|$ is a norm on V, then we can define a distance function d on V by
setting $d(\phi, \psi) = \|\psi - \phi\|$.

Definition A.32 A normed vector space is said to be a Banach space
if it is complete with respect to the associated distance function. A Banach
space is said to be separable if it contains a countable dense subset.


One important class of examples of Banach spaces is the $L^p$ spaces.
Definition A.33 An infinite series $\sum_{n=1}^{\infty} \psi_n$, with values in a normed space
V, is said to converge if there exists some $L \in V$ such that

$$\lim_{N \to \infty} \|S_N - L\| = 0,$$

where $S_N = \sum_{n=1}^{N} \psi_n$.

Proposition A.34 If V is a Banach space, then absolute convergence
implies convergence in V. That is, if

$$\sum_{n=1}^{\infty} \|\psi_n\| < \infty,$$

then $\sum_{n=1}^{\infty} \psi_n$ converges in V.


Definition A.35 If $V_1$ and $V_2$ are normed spaces, a linear map $T : V_1 \to V_2$
is bounded if

$$\sup_{\psi \in V_1 \setminus \{0\}} \frac{\|T\psi\|}{\|\psi\|} < \infty. \qquad (A.6)$$

If T is bounded, then the supremum in (A.6) is called the operator norm
of T, denoted $\|T\|$.


Theorem A.36 (Bounded Linear Transformation Theorem) Let $V_1$
be a normed space and $V_2$ a Banach space. Suppose W is a dense subspace
of $V_1$ and $T : W \to V_2$ is a bounded linear map. Then there exists a unique
bounded linear map $\tilde{T} : V_1 \to V_2$ such that $\tilde{T}|_W = T$. Furthermore, the
norm of $\tilde{T}$ equals the norm of T.


Definition A.37 If V is a normed space over F (F = R or C), then a 
bounded linear functional on V is a bounded linear map of V into F, 
where on F we use the norm given by the absolute value. The collection of 
all bounded linear functionals, with the norm given by (A.6), is called the 
dual space to V, denoted V*. 


Theorem A.38 If V is a normed vector space, then the following results
hold.


1. The dual space V* is a Banach space.

2. For all $\psi \in V$, there exists a nonzero $\xi \in V^*$ such that
$|\xi(\psi)| = \|\xi\|\,\|\psi\|$.

In particular, if $\xi(\psi) = 0$ for all $\xi \in V^*$, then $\psi = 0$.

Theorem A.39 (Closed Graph Theorem) Suppose that $V_1$ is a Banach
space and $V_2$ a normed vector space. For any linear map $T : V_1 \to V_2$, let
Graph(T) denote the set of pairs $(\psi, T\psi)$ in $V_1 \times V_2$ such that $\psi \in V_1$. If
the graph of T is a closed subset of $V_1 \times V_2$, then T is bounded.
Here is a simple example of how the closed graph theorem can be applied.
Suppose $V_1$ and $V_2$ are Banach spaces and $T : V_1 \to V_2$ is a linear map that
is one-to-one, onto, and bounded. Then the inverse map $T^{-1} : V_2 \to V_1$ is
automatically bounded. To verify this, we first check that if T is bounded,
then the graph of T is closed (easy). Then we observe that the graph of
$T^{-1}$ is also closed, since it is obtained from the graph of T by the map
$(\phi, \psi) \mapsto (\psi, \phi)$. Thus, the theorem tells us that $T^{-1}$ is bounded.

Theorem A.40 (Principle of Uniform Boundedness) Suppose $\{T_\alpha\}$
is any family of bounded linear maps from a Banach space $V_1$ to a normed
space $V_2$. Suppose that for each $\psi \in V_1$, there is a constant $C_\psi$ such that
$\|T_\alpha \psi\| \le C_\psi$ for all $\alpha$. Then there exists a constant C such that $\|T_\alpha\| \le C$
for all $\alpha$.

That is, in contrapositive form, if the family $\{T_\alpha\}$ is unbounded in
operator norm, then $\{T_\alpha \psi\}$ must be unbounded for some $\psi \in V_1$.

Corollary A.41 Suppose V is a Banach space and E is a nonempty subset
of V. Suppose that for all $\xi \in V^*$ there exists a constant $C_\xi$ such that
$|\xi(\psi)| \le C_\xi$ for all $\psi \in E$. Then E is a bounded set.

The corollary is obtained by identifying each $\psi \in V$ with the linear map
$e_\psi : V^* \to \mathbb{C}$ given by evaluation on $\psi$; that is, $e_\psi(\xi) = \xi(\psi)$. Note that by
Point 2 of Theorem A.38, the norm of $e_\psi$, as an element of $V^{**}$, is equal to
the norm of $\psi$ as an element of V.
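A.4.1 Inner Product Spaces and Hilbert Spaces

We now introduce a generalization to arbitrary vector spaces over $\mathbb{R}$ or $\mathbb{C}$
of the usual inner product (or dot product) on $\mathbb{R}^n$.

Definition A.42 An inner product on a vector space V over $\mathbb{F}$ ($\mathbb{F} = \mathbb{R}$ or
$\mathbb{C}$) is a map $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{F}$ with the following properties.

1. For all $\phi, \psi \in V$, we have $\langle \psi, \phi \rangle = \overline{\langle \phi, \psi \rangle}$.

2. For all $\phi \in V$, $\langle \phi, \phi \rangle$ is real and non-negative, and $\langle \phi, \phi \rangle = 0$ only if
$\phi = 0$.

3. For all $\phi, \psi \in V$ and $c \in \mathbb{F}$, we have $\langle c\phi, \psi \rangle = \bar{c}\,\langle \phi, \psi \rangle$ and
$\langle \phi, c\psi \rangle = c\,\langle \phi, \psi \rangle$.

4. For all $\phi, \psi, \chi \in V$, we have $\langle \phi + \psi, \chi \rangle = \langle \phi, \chi \rangle + \langle \psi, \chi \rangle$ and
$\langle \phi, \psi + \chi \rangle = \langle \phi, \psi \rangle + \langle \phi, \chi \rangle$.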
Note that we are following the physics convention of taking the complex
conjugate in Point 3 of the definition on the first factor in the inner product.

Proposition A.43 If V is an inner product space, then for all $\phi, \psi \in V$,
we have the Cauchy-Schwarz inequality:

$$|\langle \phi, \psi \rangle|^2 \le \langle \phi, \phi \rangle\, \langle \psi, \psi \rangle.$$

Furthermore, if $\|\cdot\| : V \to \mathbb{R}$ is defined by

$$\|\psi\| = \sqrt{\langle \psi, \psi \rangle}, \qquad (A.7)$$

then $\|\cdot\|$ is a norm on V.

Definition A.44 A Hilbert space is a vector space $\mathbf{H}$ over $\mathbb{R}$ or $\mathbb{C}$,
equipped with an inner product $\langle \cdot, \cdot \rangle$, such that $\mathbf{H}$ is complete in the norm
given by (A.7).

That is to say, a Hilbert space is a Banach space in which the norm
comes from an inner product. In Appendix A.4 only, we allow $\mathbf{H}$ to denote
an arbitrary Hilbert space over $\mathbb{R}$ or $\mathbb{C}$. (In the main body of the text, $\mathbf{H}$
denotes a separable complex Hilbert space.)

Definition A.45 Suppose $\mathbf{H}_j$ is a sequence of separable Hilbert spaces.
Then the Hilbert space direct sum, denoted

$$\mathbf{H} := \bigoplus_{j=1}^{\infty} \mathbf{H}_j,$$

is the space of sequences $\psi = (\psi_1, \psi_2, \psi_3, \ldots)$ such that $\psi_j \in \mathbf{H}_j$ and such
that

$$\|\psi\|^2 := \sum_{j=1}^{\infty} \|\psi_j\|_j^2 < \infty. \qquad (A.8)$$

The finite direct sum of the $\mathbf{H}_j$'s is the set of $\psi = (\psi_1, \psi_2, \psi_3, \ldots)$ such
that $\psi_j = 0$ for all but finitely many values of j.

We define an inner product on the direct sum by setting

$$\langle \phi, \psi \rangle = \sum_{j=1}^{\infty} \langle \phi_j, \psi_j \rangle_j \qquad (A.9)$$

for all $\phi, \psi \in \mathbf{H}$. This inner product is well defined and $\mathbf{H}$ is complete with
respect to this inner product, and hence a Hilbert space.

One important example of a Hilbert space is $L^2(X, \mu)$, where $(X, \mu)$ is a
measure space.
Definition A.46 If $(X, \mu)$ is a measure space, define an inner product on
$L^2(X, \mu)$ by the formula

$$\langle \phi, \psi \rangle = \int_X \overline{\phi(x)}\, \psi(x)\, d\mu(x). \qquad (A.10)$$

A standard result in measure theory states that the integral on the
right-hand side of (A.10) is absolutely convergent for all $\phi$ and $\psi$ in $L^2(X, \mu)$.
It is then easy to verify that $\langle \cdot, \cdot \rangle$ is indeed an inner product on $L^2(X, \mu)$.
Another standard result states that $L^2(X, \mu)$ is complete with respect to
the norm associated with the inner product in (A.10); thus, $L^2(X, \mu)$ is a
Hilbert space.


A.4.2 Orthogonality

One reason that Hilbert spaces are nicer to work with than general Banach
spaces is that we have the concept of orthogonality.

Definition A.47 Two elements $\phi$ and $\psi$ of an inner product space are
orthogonal if $\langle \phi, \psi \rangle = 0$.

Definition A.48 If V is any subspace of $\mathbf{H}$, define a subspace $V^\perp$ of $\mathbf{H}$
by

$$V^\perp = \{\phi \in \mathbf{H} \mid \langle \phi, \psi \rangle = 0 \text{ for all } \psi \in V\}.$$

Then $V^\perp$ is called the orthogonal space of V.


Proposition A.49


1. If V is a closed subspace of $\mathbf{H}$, every $\psi \in \mathbf{H}$ can be decomposed
uniquely as $\psi = \psi_1 + \psi_2$, with $\psi_1 \in V$ and $\psi_2 \in V^\perp$.

2. If V is any subspace of $\mathbf{H}$, then $(V^\perp)^\perp = \overline{V}$, where $\overline{V}$ is the closure
of V. In particular, if V is closed, then $(V^\perp)^\perp = V$.

If V is closed, we call $V^\perp$ the orthogonal complement of V.


Definition A.50 A set $\{e_j\}$ of elements of $\mathbf{H}$, where j ranges over an
arbitrary index set, is said to be orthonormal if

$$\langle e_j, e_k \rangle = \begin{cases} 0 & j \neq k \\ 1 & j = k \end{cases}$$

An orthonormal set $\{e_j\}$ is an orthonormal basis for $\mathbf{H}$ if the space of
finite linear combinations of the $e_j$'s is dense in $\mathbf{H}$.


If H = L 2 ([— L, L]), for some positive number L, then the functions, 


ffn 


2-irinx/L 


$n \in \mathbb{Z}$, (A.11)


form an orthonormal basis for H. 
Proposition A.51 Suppose $\{e_j\}$ is an orthonormal basis for $\mathbf{H}$. Then
every $\psi \in \mathbf{H}$ can be expressed uniquely as a convergent sum

$$\psi = \sum_j a_j e_j, \qquad (A.12)$$

where the coefficients are given by $a_j = \langle e_j, \psi \rangle$. If $\psi$ is as in (A.12), then

$$\|\psi\|^2 = \sum_j |a_j|^2.$$

Finally, if $(a_j)$ is any sequence such that $\sum_j |a_j|^2 < \infty$, there exists a
unique $\psi \in \mathbf{H}$ such that $\langle e_j, \psi \rangle = a_j$ for all j.

In the case that the orthonormal basis is the one in (A.11), the resulting
series (A.12) is called the Fourier series of $\psi$.


A.4.3 The Riesz Theorem and Adjoints

We let $\mathcal{B}(\mathbf{H})$ denote the space of bounded linear maps of $\mathbf{H}$ to $\mathbf{H}$. It is not
hard to show that $\mathcal{B}(\mathbf{H})$ forms a Banach space under the operator norm.

Theorem A.52 (Riesz Theorem) If $\xi : \mathbf{H} \to \mathbb{C}$ is a bounded linear
functional, then there exists a unique $\chi \in \mathbf{H}$ such that

$$\xi(\psi) = \langle \chi, \psi \rangle$$

for all $\psi \in \mathbf{H}$. Furthermore, the operator norm of $\xi$ as a linear functional
is equal to the norm of $\chi$ as an element of $\mathbf{H}$.

We now turn to the concept of the adjoint of a bounded operator, along
with the related concept of quadratic forms on $\mathbf{H}$.

Proposition A.53 For any $A \in \mathcal{B}(\mathbf{H})$, there exists a unique linear
operator $A^* : \mathbf{H} \to \mathbf{H}$, called the adjoint of A, such that

$$\langle \phi, A\psi \rangle = \langle A^*\phi, \psi \rangle$$

for all $\phi, \psi \in \mathbf{H}$. For all $A, B \in \mathcal{B}(\mathbf{H})$ and $\alpha, \beta \in \mathbb{C}$ we have

$$(A^*)^* = A$$
$$(AB)^* = B^* A^*$$
$$(\alpha A + \beta B)^* = \bar{\alpha} A^* + \bar{\beta} B^*$$
$$I^* = I.$$

The operator $A^*$ is bounded and $\|A^*\| = \|A\|$.
Since A is a bounded operator, the map $\psi \mapsto \langle \phi, A\psi \rangle$ is a bounded linear
functional for each fixed $\phi \in \mathbf{H}$. The Riesz theorem then tells us that there
is a unique $\chi \in \mathbf{H}$ such that $\langle \phi, A\psi \rangle = \langle \chi, \psi \rangle$. The operator $A^*$ is defined
by setting $A^*\phi = \chi$. It is not hard to check that this definition makes $A^*$
into a bounded linear operator.
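
In the finite-dimensional case $\mathbf{H} = \mathbb{C}^n$, with the inner product taken conjugate linear in the first factor as above, the adjoint is the conjugate transpose of the matrix. The following sketch checks the defining relation $\langle \phi, A\psi \rangle = \langle A^*\phi, \psi \rangle$ and the rule $(AB)^* = B^*A^*$ on sample data. The helper names and the particular matrices and vectors are assumptions made for this illustration.

```python
def inner(phi, psi):
    """Inner product on C^n, conjugate linear in the first factor."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def matvec(A, x):
    """Matrix-vector product."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def matmul(A, B):
    """Matrix-matrix product."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def adjoint(A):
    """Conjugate transpose: entry (i, j) is the conjugate of A[j][i]."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

A = [[1 + 2j, 3 - 1j], [0 + 1j, -2 + 0j]]
B = [[2 + 0j, 1 + 1j], [4 - 3j, 0 + 5j]]
phi = [1 + 1j, 2 - 3j]
psi = [0 - 1j, 4 + 0j]

defect = abs(inner(phi, matvec(A, psi)) - inner(matvec(adjoint(A), phi), psi))
product_rule_holds = adjoint(matmul(A, B)) == matmul(adjoint(B), adjoint(A))
print(defect, product_rule_holds)
```

All entries are Gaussian integers, so the floating-point arithmetic is exact and the two identities can be compared directly.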

Definition A.54 An operator A £ £?(H) is said to be self-adjoint if 
A* = A and skew-self-adjoint if A* = — A. 

Definition A.55 An operator U on H is unitary if U is surjective and 
preserves inner products, that is, (Ucp,Uip) = (4>,ip) for all (p,ip £ H. 

If U is unitary, then U preserves norms (||b r V , ll = ||'0|| f° r all £ H); 
therefore, U is bounded with ||£/|| = 1. By the polarization identity (Propo¬ 
sition A.59), if U preserves norms, then it also preserves inner products. 

Proposition A.56 A bounded operator $U$ is unitary if and only if
$U^* = U^{-1}$, that is, if and only if $UU^* = U^*U = I$.
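As a small sanity check of Proposition A.56 in $\mathbb{C}^2$, one can verify both characterizations of unitarity on a concrete matrix. This is only a sketch: the matrix `U` and the helper names are hypothetical choices for illustration.

```python
# Sketch: check U U* = U* U = I and preservation of inner products
# for one hypothetical 2x2 unitary matrix.

def inner(x, y):
    """Inner product, conjugate linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def matmul(A, B):
    """Matrix product A B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def adjoint(A):
    """Conjugate transpose of A."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

U = [[0j, 1j], [1 + 0j, 0j]]   # illustrative unitary matrix
I2 = [[1, 0], [0, 1]]
phi = [1 + 2j, 3 - 1j]
psi = [2 + 0j, 1 + 1j]

left = matmul(U, adjoint(U))    # U U*
right = matmul(adjoint(U), U)   # U* U
```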

Proposition A.57 For any closed subspace $V \subset \mathbf{H}$, there is a
unique bounded operator $P$ such that $P = I$ on $V$ and $P = 0$ on the
orthogonal complement $V^\perp$. This operator is called the orthogonal
projection onto $V$ and it satisfies $P^2 = P$ and $P^* = P$.

Conversely, if $P$ is any bounded operator on $\mathbf{H}$ satisfying
$P^2 = P$ and $P^* = P$, then $P$ is the orthogonal projection onto a
closed subspace $V$, where $V = \mathrm{range}(P)$.
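For a one-dimensional subspace $V$ spanned by a nonzero vector $v \in \mathbb{C}^n$, the orthogonal projection has the matrix $v_i \bar{v}_j / \langle v, v \rangle$. The sketch below builds it and checks the properties in Proposition A.57; the function `projection_onto` and the vectors `v`, `w` are assumptions chosen for illustration.

```python
# Sketch: orthogonal projection onto the span of one vector v in C^2.

def inner(x, y):
    """Inner product, conjugate linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def matmul(A, B):
    """Matrix product A B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def adjoint(A):
    """Conjugate transpose of A."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def projection_onto(v):
    """Matrix of the map psi -> v <v, psi> / <v, v>."""
    norm_sq = inner(v, v).real
    n = len(v)
    return [[v[i] * v[j].conjugate() / norm_sq for j in range(n)]
            for i in range(n)]

v = [1 + 0j, 1j]    # spans the subspace V
w = [1 + 0j, -1j]   # orthogonal to v, so it lies in the complement of V
P = projection_onto(v)
```

Here `P` acts as the identity on `v`, annihilates `w`, and satisfies both `matmul(P, P) == P` and `adjoint(P) == P`.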

A.4.4 Quadratic Forms

In this section, we develop the theory of quadratic forms on Hilbert spaces. 
Since this is customarily done only for the inner product itself, we include 
the proofs of the results. 

Definition A.58 A sesquilinear form on $\mathbf{H}$ is a map
$L : \mathbf{H} \times \mathbf{H} \to \mathbb{C}$ that is conjugate linear
in the first factor and linear in the second factor. A sesquilinear form is
bounded if there exists a constant $C$ such that

\[ |L(\phi, \psi)| \le C \|\phi\| \|\psi\| \]

for all $\phi, \psi \in \mathbf{H}$.

Proposition A.59 If $L$ is a sesquilinear form on $\mathbf{H}$, $L$ can be
recovered from its values on the diagonal (i.e., the value of
$L(\psi, \psi)$ for various $\psi$'s) as follows:

\begin{align*}
L(\phi, \psi) &= \frac{1}{2} [L(\phi + \psi, \phi + \psi) - L(\phi, \phi) - L(\psi, \psi)] \\
&\quad - \frac{i}{2} [L(\phi + i\psi, \phi + i\psi) - L(\phi, \phi) - L(i\psi, i\psi)]. \tag{A.13}
\end{align*}

This formula is known as the polarization identity.

Note that we do not assume any relationship between $L(\phi, \psi)$ and
$L(\psi, \phi)$.
Proof. Direct calculation. ■ 
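For readers who want the direct calculation spelled out, here is one way to organize it. This is a sketch that uses only sesquilinearity (conjugate linear in the first factor, linear in the second), so that $L(\phi, i\psi) = i L(\phi, \psi)$ and $L(i\psi, \phi) = -i L(\psi, \phi)$; the layout is not taken from the text.

```latex
\begin{align*}
L(\phi+\psi,\phi+\psi) - L(\phi,\phi) - L(\psi,\psi)
  &= L(\phi,\psi) + L(\psi,\phi), \\
L(\phi+i\psi,\phi+i\psi) - L(\phi,\phi) - L(i\psi,i\psi)
  &= i\,L(\phi,\psi) - i\,L(\psi,\phi), \\
\frac{1}{2}\bigl[L(\phi,\psi) + L(\psi,\phi)\bigr]
  - \frac{i}{2}\bigl[i\,L(\phi,\psi) - i\,L(\psi,\phi)\bigr]
  &= L(\phi,\psi).
\end{align*}
```

The first bracket isolates the symmetric combination of $L(\phi,\psi)$ and $L(\psi,\phi)$, the second isolates the antisymmetric one, and the two unwanted copies of $L(\psi,\phi)$ cancel.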

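Definition A.60 A quadratic form on a Hilbert space $\mathbf{H}$ is a map
$Q : \mathbf{H} \to \mathbb{C}$ with the following properties:
(1) $Q(\lambda\psi) = |\lambda|^2 Q(\psi)$ for all $\psi \in \mathbf{H}$
and $\lambda \in \mathbb{C}$, and (2) the map
$L : \mathbf{H} \times \mathbf{H} \to \mathbb{C}$ defined by

\begin{align*}
L(\phi, \psi) &= \frac{1}{2} [Q(\phi + \psi) - Q(\phi) - Q(\psi)] \\
&\quad - \frac{i}{2} [Q(\phi + i\psi) - Q(\phi) - Q(i\psi)]
\end{align*}

is a sesquilinear form. A quadratic form $Q$ is bounded if there exists a
constant $C$ such that

\[ |Q(\phi)| \le C \|\phi\|^2 \]

for all $\phi \in \mathbf{H}$. The smallest such constant $C$ is the norm
of $Q$.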

Proposition A.61 If $Q$ is a quadratic form on $\mathbf{H}$ and $L$ is the
associated sesquilinear form, we have the following results.

1. For all $\psi \in \mathbf{H}$, we have $Q(\psi) = L(\psi, \psi)$.

2. If $Q$ is bounded, then $L$ is bounded.

3. If $Q(\psi)$ belongs to $\mathbb{R}$ for all $\psi \in \mathbf{H}$, then
$L$ is conjugate symmetric, that is,

\[ L(\phi, \psi) = \overline{L(\psi, \phi)} \]

for all $\phi, \psi \in \mathbf{H}$.

Proof. Point 1 of the proposition is verified by taking $\phi = \psi$ in
the expression for $L(\phi, \psi)$ and then using the relation
$Q(\lambda\psi) = |\lambda|^2 Q(\psi)$. For Point 2, suppose
$|Q(\psi)| \le C \|\psi\|^2$ for all $\psi \in \mathbf{H}$. If
$\|\phi\| = \|\psi\| = 1$, then $\phi + \psi$ and $\phi + i\psi$ have norm
at most 2, and so

\[ |L(\phi, \psi)| \le \frac{C}{2} (4 + 1 + 1 + 4 + 1 + 1) = 6C. \]

Now, for any $\phi$ and $\psi$ in $\mathbf{H}$, we can find unit vectors
$\tilde{\phi}$ and $\tilde{\psi}$ such that
$\phi = \|\phi\| \tilde{\phi}$ and $\psi = \|\psi\| \tilde{\psi}$. Then
since $L$ is assumed to be sesquilinear, we have

\[ |L(\phi, \psi)| = \|\phi\| \|\psi\| \, |L(\tilde{\phi}, \tilde{\psi})| \le 6C \|\phi\| \|\psi\|, \]

showing that $L$ is bounded.

For Point 3, assume that $Q(\psi)$ is real for all $\psi \in \mathbf{H}$
and define a map $M : \mathbf{H} \times \mathbf{H} \to \mathbb{R}$ by

\[ M(\phi, \psi) = \frac{1}{2} [Q(\phi + \psi) - Q(\phi) - Q(\psi)] = \mathrm{Re}[L(\phi, \psi)]. \]

Then $M$ is real-bilinear (because it is the real part of $L$) and
symmetric (because of the expression for $M$ in terms of $Q$).
Furthermore, $M(i\phi, i\psi) = M(\phi, \psi)$. These properties of $M$
show that $M(\phi, i\psi) = -M(\psi, i\phi)$, and so

\begin{align*}
L(\phi, \psi) &= M(\phi, \psi) - i M(\phi, i\psi) \\
&= M(\psi, \phi) + i M(\psi, i\phi) \\
&= \overline{L(\psi, \phi)},
\end{align*}

which is what we wanted to prove. ■

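Example A.62 If $A$ is a bounded operator on $\mathbf{H}$, one can
construct a bounded quadratic form $Q_A$ on $\mathbf{H}$ by setting

\[ Q_A(\psi) = \langle \psi, A\psi \rangle, \quad \psi \in \mathbf{H}. \]

The associated sesquilinear form $L_A$ is then given by

\[ L_A(\phi, \psi) = \langle \phi, A\psi \rangle, \quad \phi, \psi \in \mathbf{H}. \]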
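The passage from $Q_A$ back to $L_A$ can be checked numerically in $\mathbb{C}^2$. The sketch below implements the formula of Definition A.60 and compares the result with $\langle \phi, A\psi \rangle$ for a matrix that is deliberately not self-adjoint; the function names `quadratic_form` and `polarize` and the sample data are assumptions made for illustration.

```python
# Sketch: recover L_A(phi, psi) = <phi, A psi> from Q_A(psi) = <psi, A psi>
# using the formula in Definition A.60. Names are illustrative only.

def inner(x, y):
    """Inner product, conjugate linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def add(x, y):
    return [a + b for a, b in zip(x, y)]

def scale(c, x):
    return [c * a for a in x]

def quadratic_form(A):
    """Return Q_A as a function psi -> <psi, A psi>."""
    return lambda psi: inner(psi, apply(A, psi))

def polarize(Q):
    """Return the sesquilinear form associated with Q."""
    def L(phi, psi):
        i_psi = scale(1j, psi)
        first = Q(add(phi, psi)) - Q(phi) - Q(psi)
        second = Q(add(phi, i_psi)) - Q(phi) - Q(i_psi)
        return 0.5 * first - 0.5j * second
    return L

A = [[1 + 2j, 3 + 0j], [0 - 1j, 2 + 1j]]   # not self-adjoint
Q = quadratic_form(A)
L = polarize(Q)
phi = [1 + 1j, 2 - 1j]
psi = [3 + 0j, 0 + 1j]
```

Because `A` is not self-adjoint, `L(phi, psi)` and `L(psi, phi)` are unrelated in general, yet each is recovered from diagonal values of `Q` alone.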

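Proposition A.63 If $Q$ is a bounded quadratic form on $\mathbf{H}$, there
is a unique $A \in \mathcal{B}(\mathbf{H})$ such that
$Q(\psi) = \langle \psi, A\psi \rangle$ for all $\psi \in \mathbf{H}$. If
$Q(\psi)$ belongs to $\mathbb{R}$ for all $\psi \in \mathbf{H}$, then the
operator $A$ is self-adjoint.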

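Proof. Since $Q$ is bounded, $L$ is also bounded, meaning that there exists
a constant $C$ such that $|L(\phi, \psi)| \le C \|\phi\| \|\psi\|$ for all
$\phi, \psi \in \mathbf{H}$. Thus, for any $\phi \in \mathbf{H}$, the
linear functional $\psi \mapsto L(\phi, \psi)$ is bounded, with norm at
most $C \|\phi\|$. By the Riesz theorem, then, there exists a unique
$\chi \in \mathbf{H}$, with $\|\chi\| \le C \|\phi\|$, such that
$L(\phi, \psi) = \langle \chi, \psi \rangle$. We now define a map
$B : \mathbf{H} \to \mathbf{H}$ by defining $B\phi = \chi$. Direct
calculation shows that $B$ is linear, and the inequality
$\|\chi\| \le C \|\phi\|$ shows that $B$ is bounded. Setting $A = B^*$
gives $L(\phi, \psi) = \langle B\phi, \psi \rangle = \langle \phi, A\psi \rangle$
and hence $Q(\psi) = L(\psi, \psi) = \langle \psi, A\psi \rangle$, which
establishes the existence of the desired operator. Uniqueness of $A$
follows from the polarization identity, by which $Q$ determines
$\langle \phi, A\psi \rangle$ for all $\phi, \psi \in \mathbf{H}$, together
with the observation that if $\langle \phi, A\psi \rangle = 0$ for all
$\phi, \psi \in \mathbf{H}$, then $A$ is the zero operator.

If $Q(\psi)$ is real for all $\psi \in \mathbf{H}$, then by Point 3 of
Proposition A.61, $L$ is conjugate symmetric. Thus,

\[ \langle \phi, A\psi \rangle = L(\phi, \psi) = \overline{L(\psi, \phi)} = \overline{\langle \psi, A\phi \rangle} = \langle A\phi, \psi \rangle \]

for all $\phi, \psi \in \mathbf{H}$, showing that $A$ is self-adjoint. ■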
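In finite dimensions the operator $A$ of Proposition A.63 can be read off entry by entry, since $A_{jk} = \langle e_j, A e_k \rangle = L(e_j, e_k)$ for the standard basis vectors. The following sketch does this for a real-valued quadratic form on $\mathbb{C}^2$ given only as a function; the form `Q`, the helper `operator_of`, and the other names are hypothetical and chosen for illustration.

```python
# Sketch: recover the matrix A with Q(psi) = <psi, A psi> from a quadratic
# form Q on C^2, via A[j][k] = L(e_j, e_k). Names are illustrative only.

def add(x, y):
    return [a + b for a, b in zip(x, y)]

def scale(c, x):
    return [c * a for a in x]

def polarize(Q):
    """Sesquilinear form associated with Q (Definition A.60)."""
    def L(phi, psi):
        i_psi = scale(1j, psi)
        first = Q(add(phi, psi)) - Q(phi) - Q(psi)
        second = Q(add(phi, i_psi)) - Q(phi) - Q(i_psi)
        return 0.5 * first - 0.5j * second
    return L

def operator_of(Q, n):
    """Matrix of the operator A determined by Q on C^n."""
    L = polarize(Q)
    basis = [[1 + 0j if k == j else 0j for k in range(n)] for j in range(n)]
    return [[L(basis[j], basis[k]) for k in range(n)] for j in range(n)]

# A hypothetical real-valued quadratic form on C^2, treated as a black box.
def Q(psi):
    a, b = psi
    cross = (a.conjugate() * (1 + 1j) * b).real
    return 2 * (a.conjugate() * a).real + 3 * (b.conjugate() * b).real + 2 * cross

A = operator_of(Q, 2)
```

Since `Q` takes only real values, the recovered matrix should equal its own conjugate transpose, in line with the last statement of the proposition.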


A.4.5 Tensor Products of Hilbert Spaces

Recall from Appendix A.1 the concept of the tensor product of two vector
spaces.

Proposition A.64 Suppose $V_1$ and $V_2$ are inner product spaces, with
inner products $\langle \cdot, \cdot \rangle_1$ and
$\langle \cdot, \cdot \rangle_2$. Then there exists a unique inner product
$\langle \cdot, \cdot \rangle$ on $V_1 \otimes V_2$ such that

\[ \langle u_1 \otimes v_1, u_2 \otimes v_2 \rangle = \langle u_1, u_2 \rangle_1 \langle v_1, v_2 \rangle_2 \]

for all $u_1, u_2 \in V_1$ and $v_1, v_2 \in V_2$.

If $\mathbf{H}_1$ and $\mathbf{H}_2$ are Hilbert spaces, then we can equip
the tensor product $\mathbf{H}_1 \otimes \mathbf{H}_2$ with the inner
product in Proposition A.64. If $\mathbf{H}_1$ and $\mathbf{H}_2$ are both
infinite dimensional, however, $\mathbf{H}_1 \otimes \mathbf{H}_2$ will not
be complete with respect to this inner product. Nevertheless, we can
complete $\mathbf{H}_1 \otimes \mathbf{H}_2$ with respect to this inner
product, thus obtaining a new Hilbert space.

Definition A.65 If $\mathbf{H}_1$ and $\mathbf{H}_2$ are Hilbert spaces,
then the Hilbert tensor product of $\mathbf{H}_1$ and $\mathbf{H}_2$,
denoted $\mathbf{H}_1 \hat{\otimes} \mathbf{H}_2$, is the Hilbert space
obtained by completing $\mathbf{H}_1 \otimes \mathbf{H}_2$ with respect to
the inner product in Proposition A.64.

Proposition A.66 If $\mathbf{H}_1$ and $\mathbf{H}_2$ are Hilbert spaces
with orthonormal bases $\{e_j\}$ and $\{f_k\}$, respectively, then
$\{e_j \otimes f_k\}$ is an orthonormal basis for the Hilbert space
$\mathbf{H}_1 \hat{\otimes} \mathbf{H}_2$.

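Proposition A.67 If $A$ is a bounded operator on $\mathbf{H}_1$ and $B$ is
a bounded operator on $\mathbf{H}_2$, then there exists a unique bounded
operator on $\mathbf{H}_1 \hat{\otimes} \mathbf{H}_2$, denoted
$A \otimes B$, such that

\[ (A \otimes B)(\phi \otimes \psi) = (A\phi) \otimes (B\psi) \]

for all $\phi \in \mathbf{H}_1$ and $\psi \in \mathbf{H}_2$.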

To see that $A \otimes B$ is bounded, first write $A \otimes B$ as
$(A \otimes I)(I \otimes B)$. Then, given any orthonormal basis $\{f_j\}$
for $\mathbf{H}_2$, we can decompose
$\mathbf{H}_1 \hat{\otimes} \mathbf{H}_2$ as the Hilbert space direct sum
of subspaces of the form $\mathbf{H}_1 \otimes f_j$. The operator
$A \otimes I$ acts on this decomposition as a block-diagonal operator with
$A$ in each diagonal block. From this, it is easy to verify that
$\|A \otimes I\| = \|A\|$. A similar argument shows that
$\|I \otimes B\| = \|B\|$, and so

\[ \|A \otimes B\| \le \|A \otimes I\| \|I \otimes B\| = \|A\| \|B\|. \]

Meanwhile, by taking a sequence of unit vectors $\phi_n \in \mathbf{H}_1$
and $\psi_n \in \mathbf{H}_2$ with $\|A\phi_n\| \to \|A\|$ and
$\|B\psi_n\| \to \|B\|$, we see that the reverse inequality holds, and thus
that $\|A \otimes B\| = \|A\| \|B\|$.
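When $\mathbf{H}_1$ and $\mathbf{H}_2$ are finite dimensional, no completion is needed and the tensor product can be modeled by Kronecker products of coordinate vectors and matrices. The sketch below checks the inner product formula of Proposition A.64 and the defining property of $A \otimes B$ in Proposition A.67 on small examples; the functions `kron_vec` and `kron_mat` and the sample data are assumptions made for illustration, and the norm identity is not checked here.

```python
# Sketch: model C^m (x) C^n as C^(mn) using Kronecker products.
# Names are illustrative only.

def inner(x, y):
    """Inner product, conjugate linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def kron_vec(u, v):
    """Coordinates of u (x) v, with the index of u varying slowest."""
    return [a * b for a in u for b in v]

def kron_mat(A, B):
    """Matrix of A (x) B in the same ordering as kron_vec."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

A = [[1 + 2j, 3 + 0j], [0 - 1j, 2 + 1j]]
B = [[0 + 1j, 1 + 0j], [2 + 0j, 1 - 1j]]
u1, u2 = [1 + 1j, 2 - 1j], [3 + 0j, 0 + 1j]
v1, v2 = [1 - 2j, 0 + 1j], [2 + 1j, 1 + 0j]

# Proposition A.64: the inner product of simple tensors factorizes.
lhs_inner = inner(kron_vec(u1, v1), kron_vec(u2, v2))
rhs_inner = inner(u1, u2) * inner(v1, v2)

# Proposition A.67: (A (x) B)(phi (x) psi) = (A phi) (x) (B psi).
lhs_op = apply(kron_mat(A, B), kron_vec(u1, v1))
rhs_op = kron_vec(apply(A, u1), apply(B, v1))
```

The ordering of indices in `kron_vec` and `kron_mat` must match; with the convention used here the first factor's index varies slowest in both.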